I.  THE NEED FOR A DATA DICTIONARY

A.   BACKGROUND 

One of the most important resources of an organization,
and one that is too often overlooked, is data. People,
dollars, materials, and time are usually well controlled and
budgeted, yet the data about an organization and its
operations is often managed haphazardly, if at all.

Database technology has made possible the storage and
processing of an organization's data as an integrated whole
and allows the sharing of that processed data, or
information, throughout the organization. A database management
system (DBMS) acts as a librarian for the database, storing
and retrieving data according to a particular format
[Ref. 1]. However, a DBMS does not necessarily provide for
the security, integrity, accountability, or maintainability
of data. These objectives are best achieved when a data
dictionary is used in conjunction with the DBMS.

Simply stated, a data dictionary is a central repository
of descriptive data about the definition, characteristics,
location, and usage of the data found in an organization. A
fully utilized data dictionary will control the collection,
maintenance, and retrieval of this data. For example, if
the aircraft carrier U.S.S. Constellation had a data
dictionary, it would be possible to ask questions such as:

What type of data is contained in a "Controlled Equipage"
record?

How many programs use the "Personnel" file?

Which departments receive the "Ammunition Transaction"
report?

What is the relationship between "Inventory Item" and
"Reorder Point"?

In which records is the field "Social Security Number"
found?

Who is authorized to update the "Readiness Status" field?

What is the range of values for "Readiness Status" data?

In which database is the "Preventive Maintenance" file
found?


Those who will benefit from the answers to these questions
include not only the ship's data administrator, but also
programmers, systems development personnel, data processing
staff, auditors, and, most important, end users at every
level of the organization.
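
Questions of this kind amount to lookups over descriptive
records. The following is a minimal sketch, assuming a toy
in-memory dictionary. The record layouts, program names, and
helper functions are hypothetical illustrations and are not
taken from any actual shipboard system or dictionary product.

```python
# Hypothetical metadata: which fields each record contains and which
# programs use each file. All names below are illustrative only.
RECORD_FIELDS = {
    "Personnel": ["Name", "Social Security Number", "Rank"],
    "Pay": ["Social Security Number", "Base Pay"],
    "Controlled Equipage": ["Item Name", "Custodian"],
}
FILE_USERS = {
    "Personnel": ["MUSTER-REPORT", "PAY-UPDATE"],
    "Preventive Maintenance": ["PMS-SCHEDULER"],
}


def records_containing(field_name):
    """Return the names of all records in which a field is found."""
    return sorted(
        record for record, fields in RECORD_FIELDS.items() if field_name in fields
    )


def program_count(file_name):
    """Return how many programs use a given file (0 if the file is unknown)."""
    return len(FILE_USERS.get(file_name, []))


print(records_containing("Social Security Number"))
print(program_count("Personnel"))
```

The point of the sketch is only that each question maps to a
simple query once the descriptive data is held in one place.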

Even though data dictionary software has been available
commercially since 1970 and the advantages and benefits
associated with data dictionaries are widely recognized,
most organizations have been slow to implement them, and the
Department of Defense is no exception. A recent study by
the Committee on Review of Navy Long-Range Automatic Data
Processing Planning [Ref. 2] points out that


Virtually every action by a commander, manager, or
administrator in the Navy, as in any large organization,
involves the acquisition and understanding of
information: information about the organization, about its
status, about its resources, about its environment. His
actions usually result in the creation and promulgation
of policies and directives: that is, information for
subordinates, peers, or superiors.


If it is true that "the benefit derived from a dictionary is
proportional to the size of the dictionary itself," [Ref. 3]
the military stands to gain a great deal from the
implementation of data dictionaries.

At present, there is no consensus in computing
literature about exactly what a data dictionary should do or what
kind of data dictionary is best for a particular
organization. There are many different data dictionary packages on
the market from which to choose; most of these have similar
features. Therefore, the potential purchaser of a data
dictionary is in need of guidance when making this choice.
The United States Government has recognized this problem and
has identified standards for data dictionaries in Federal
Information Processing Standards promulgated by the National
Bureau of Standards. An understanding of these standards
and of the functions and objectives of a data dictionary
will provide the reader with a basis on which to evaluate
data dictionary packages and to use them effectively.

B.   PURPOSE  OF  THE  THESIS 

We believe that it is important for managers in the
military to understand what a data dictionary is and what it
can do to help an organization manage its data. Thus, the
purpose of this thesis is to provide the reader with an
understanding of the structure and functions of a data
dictionary, guidelines for the evaluation and selection of a
data dictionary, and an analysis of several commercial data
dictionary products. We will show the reader how the
management of an organization's data resource can be
accomplished by means of a data dictionary and will recommend
ways for the role of the data dictionary to be expanded.

II.  THE ANATOMY OF A DATA DICTIONARY

A.   INTRODUCTION

Because data dictionary technology is a new and
continually evolving field, it suffers from a lack of consistency
in its terminology. The many texts and articles on the
subject and the various commercial data dictionary products
use a wide variety of differing terms. The data dictionary
itself is known as a data dictionary/directory, a data
dictionary system, or an information resource management
dictionary. In order to provide a base of reference for the
remainder of this thesis, we will present our own set of
definitions distilled from our references.

Data dictionaries run the gamut from manual, on-paper
systems to highly sophisticated software and can be used
both in database and non-database environments. We will
discuss automated data dictionaries only as they relate to a
database, where they have the most to offer the potential
user.

In order to assess the benefits of a data dictionary, it
is necessary to understand how a data dictionary is
organized and what its capabilities are. A data dictionary does
not contain the actual data that constitutes an
organization's database; instead, it is itself a database called a
metadatabase that contains metadata, or data about the
database data. Two types of metadata are found in a data
dictionary. Dictionary metadata tells what data exists, the
origins of the data, the attributes the data may have, how
and by whom the data may be used, what the structure of the
data is, and what the relationships between the data are.
Directory metadata tells where the data is located, how it
can be accessed, and what its physical representation within
the computer is. Together, these two types of metadata
provide the means for accessing and controlling the data in
the database. Figure 2.1 illustrates this division of
metadata.


DATA DICTIONARY

DICTIONARY METADATA

-  Data Present
-  Data Origin
-  Attributes
-  Security/Access
-  Data Structure
-  Relationships

DIRECTORY METADATA

-  Data Location
-  Access Modes
-  Physical Representation


Figure 2.1    Types of Data Dictionary Metadata
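
The division between dictionary metadata and directory
metadata can be sketched as two small record types. This is
an illustration only: the class names, field names, and the
sample entry are assumptions made for the example and do not
reflect the layout of any particular dictionary product.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryMetadata:
    """What the data is: origin, attributes, access, and relationships."""
    name: str
    origin: str
    attributes: dict = field(default_factory=dict)
    authorized_users: list = field(default_factory=list)
    related_to: list = field(default_factory=list)


@dataclass
class DirectoryMetadata:
    """Where the data is and how it is physically reached."""
    location: str
    access_mode: str
    physical_format: str


@dataclass
class DictionaryEntry:
    """One element described by both kinds of metadata."""
    dictionary: DictionaryMetadata
    directory: DirectoryMetadata


entry = DictionaryEntry(
    DictionaryMetadata(
        name="Readiness Status",
        origin="Operations Department",     # hypothetical origin
        attributes={"length": 2},           # hypothetical attribute
        authorized_users=["OPS-OFFICER"],   # hypothetical user
    ),
    DirectoryMetadata(
        location="READINESS file",          # hypothetical location
        access_mode="indexed",
        physical_format="2-byte character",
    ),
)
print(entry.dictionary.name, "is stored in", entry.directory.location)
```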

Data dictionaries fall into two categories:
free-standing and DBMS-dependent. Figure 2.2 shows a partial
listing of some commercial data dictionary packages
according to type. A free-standing data dictionary (also
called independent or stand-alone) is not tied to any
particular database management system (DBMS). It manages
data by utilizing software routines built into the data
dictionary package and thus is not dependent on DBMS
software. This independence provides flexibility: a
free-standing data dictionary can have the capability to support
more than one type of DBMS. However, this flexibility is
gained at the cost of duplication of data descriptions in
the database and the data dictionary.

Free-standing Data Dictionaries

DATA CATALOGUE 2 (1974)
-  Synergetics Corporation
DATA DESIGNER (1975)
-  Database Design, Inc.
PRIDE-LOGIK (1974)
-  M. Bryce & Associates, Inc.
DATAMANAGER (1975)
-  Management Systems & Programming, LTD

DBMS-Dependent Data Dictionaries

ADABAS (1978)
-  Software AG of North America, Inc.
DATA DICTIONARY/DATACOM (1979)
-  Applied Data Research (ADR)
ORACLE (1983)
-  Relational Software, Inc.
DB/DC DATA DICTIONARY (1974)
-  International Business Machines
EDICT (1976)
-  Infodata Systems, Inc.


Figure 2.2   Free-standing and Dependent Data Dictionaries

A DBMS-dependent data dictionary (also called merged or
integrated) is a component of a specific database management
system; it uses the software facilities available within the
DBMS to manage the data in the database. This type of data
dictionary minimizes redundancy and limits the number of
possible errors because data descriptions exist in only one
place, in the data dictionary. It also benefits from the
sophisticated backup and recovery facilities of the DBMS.

A data dictionary is also described as having active or
passive interfaces or a combination of the two. An
interface is a series of commands which connect the data
dictionary with other software such as compilers, operating
systems, report generators, and other programs. The data
dictionary supports these applications by providing the
metadata that is required for their execution. An active
data dictionary is one in which information is created,
accessed, or modified through the data dictionary
interfaces. New or changed metadata is automatically updated and
stored in the data dictionary. This is not true of a
passive data dictionary: when new metadata is generated,
the data dictionary may or may not be automatically updated
and when data is retrieved, it may be accessed through the
data dictionary or directly from the database.
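
The difference between active and passive operation can be
sketched as a question of whether changes are forced to pass
through the dictionary. The classes below are hypothetical
and deliberately tiny; they assume a "database" that tracks
only which fields each record type contains.

```python
class Database:
    """Toy database that only tracks which fields each record type has."""

    def __init__(self):
        self.record_fields = {}

    def add_field(self, record, field_name):
        self.record_fields.setdefault(record, []).append(field_name)


class ActiveDictionary:
    """Changes made through this interface update metadata and database together."""

    def __init__(self, database):
        self._database = database
        self.metadata = {}

    def add_field(self, record, field_name):
        self.metadata.setdefault(record, []).append(field_name)
        self._database.add_field(record, field_name)


db = Database()
dictionary = ActiveDictionary(db)
dictionary.add_field("Student", "Rank")   # dictionary and database agree

# Passive arrangement: the database is changed directly, nothing forces
# the dictionary to be updated, and its metadata falls out of date.
db.add_field("Student", "Service")
print(dictionary.metadata["Student"])
print(db.record_fields["Student"])
```

In the final two lines the dictionary and the database
disagree, which is the risk a passive interface accepts.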

There are many perspectives from which to look at the
data that resides in a database. There is the physical (or
internal) view that consists of the actual physical
representation, format, and location of the data as "seen" by the
computer. There is a logical (or conceptual or global
enterprise) view called a schema which describes all of the
data in the database in its logical format, i.e., what types
of records are to be maintained, the contents of those
records, and the relationships among those records. This is
the data as it would be presented to a human, not its actual
computer format. In most cases, only the database
administrator has access to the schema. Another view is the
external view, also called a subschema, which is a subset
of the logical view tailored to a particular user or
application. This is analogous to a "window" through which only
a portion of the total data is seen. Subschemas can be
utilized to implement security by restricting a user's
access to data.

Figure 2.3 shows the three different perspectives of
data in a sample database of students at the Naval
Postgraduate School. (A) is the computer's physical view
and thus is not visible to the human user. (B) shows the
overall logical view of this small database. (C) is a
subset of (B) as it would be seen by a user who is
interested in only a portion of the database--in this case,
the senior Army officer who wants information only on Army
students.

Figure 2.3    Views Within a DBMS
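
A subschema can be thought of as a filter applied to the
logical view. The sketch below assumes a student table like
the one described in the text; the rows are invented sample
data, and the function name `subschema` is a hypothetical
helper rather than a feature of any specific DBMS.

```python
# Logical view (schema): every student record with all of its fields.
# The rows are invented sample data.
STUDENTS = [
    {"name": "Student A", "service": "Army", "rank": "O-3"},
    {"name": "Student B", "service": "Navy", "rank": "O-4"},
    {"name": "Student C", "service": "Army", "rank": "O-5"},
]


def subschema(rows, visible_fields, predicate):
    """Return an external view: only the permitted rows and fields."""
    return [
        {name: row[name] for name in visible_fields}
        for row in rows
        if predicate(row)
    ]


# The senior Army officer's "window": Army students only, name and rank only.
army_view = subschema(
    STUDENTS, ["name", "rank"], lambda row: row["service"] == "Army"
)
for row in army_view:
    print(row)
```

Because the view exposes neither Navy rows nor the service
field, the same mechanism doubles as an access restriction.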

B.   THE STRUCTURE OF A DATA DICTIONARY

There are three kinds of elements upon which the
structure, or schema, of a data dictionary is built: entities,
attributes, and relationships. The basic element of the
dictionary is the entity. Each entity has a unique name and
represents an object in the real world, such as a person,
thing, or idea about which information is recorded. For
example, in our Naval Postgraduate School database we
collected information about students. We also described the
students by name, Social Security number, service, and rank.
These characteristics of an entity are called attributes,
and can be either quantitative or qualitative.

A relationship is a logical link between two entities
that can also be described by attributes. A relationship
will fall into one of three categories of mappings:
one-to-one, one-to-many/many-to-one, or many-to-many. A
one-to-one relationship exists when each entity or attribute is
logically linked to one and only one other entity or
attribute. For instance, we say that there is a one-to-one
relationship between an individual's Social Security number and
his name. In a one-to-many/many-to-one relationship, each
entity or attribute is logically linked to one or more other
entities or attributes. An example of this is the
relationship between the instructor of a class and the students in
that class. A many-to-many relationship occurs when one or
more entities or attributes are related to one or more other
entities or attributes. For example, there is a
many-to-many relationship between the attributes "color" and "model"
of a type of car--each color may be available on many
different car models and each car model may be available in
many different colors.
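
The three categories of mappings can be told apart by
counting how many partners each side has. The following is a
minimal sketch that classifies a set of observed links; the
function name and the sample pairs are illustrative
assumptions, and the result describes only the links
supplied, not a rule declared in a dictionary.

```python
def classify_mapping(pairs):
    """Classify (left, right) links as one-to-one,
    one-to-many/many-to-one, or many-to-many."""
    rights_per_left = {}
    lefts_per_right = {}
    for left, right in set(pairs):
        rights_per_left.setdefault(left, set()).add(right)
        lefts_per_right.setdefault(right, set()).add(left)
    # Does any left item link to several right items, and vice versa?
    left_fans_out = any(len(v) > 1 for v in rights_per_left.values())
    right_fans_out = any(len(v) > 1 for v in lefts_per_right.values())
    if left_fans_out and right_fans_out:
        return "many-to-many"
    if left_fans_out or right_fans_out:
        return "one-to-many/many-to-one"
    return "one-to-one"


ssn_to_name = [("547-23-3410", "Jennifer C. Brown")]
instructor_to_student = [("Instructor", "Student 1"), ("Instructor", "Student 2")]
color_to_model = [("red", "sedan"), ("red", "coupe"), ("blue", "sedan")]

print(classify_mapping(ssn_to_name))
print(classify_mapping(instructor_to_student))
print(classify_mapping(color_to_model))
```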

In order to understand the generic terms we have
presented in their proper context, it is important to
differentiate between the dictionary schema itself, the
metadatabase that it governs, and the "real" data in the
organization's database. These concepts are made even more
confusing because the terminology used to refer to these
three levels of data differs from vendor to vendor and from
author to author. We will look at these levels using the
Applied Data Research, Inc. DATADICTIONARY terminology
[Ref. 4] because it provides the clearest distinction
between the three. (DATADICTIONARY will be discussed in
depth in Chapter 7.)

At the highest level of abstraction, entities,
attributes, and relationships are grouped by type:


the dictionary schema can then be thought of as
containing all existing entity-types, relationship-types,
and attribute-types, any one of which will also
be referred to as a schema descriptor [Ref. 5].


The schema descriptors are the general categories of data
that are stored in the metadatabase. Figure 2.4 shows
examples of some standard schema descriptors.


Entity-types    Attribute-types     Relationship-types

File            Author              Contains
Record          Description         Owns
Field           Password            Processes
Module          Status              Derived From
Program         Version             Resides
Report          Frequency           Uses
Job             Security Class      Includes
Dataview        Alias               Authority
User            Comment             Accesses
System          Effective Date
Process         Usage Statistics


Figure 2.4   Sample Schema Descriptors

At the metadatabase level, we look at specific instances
of schema descriptors. Thus, we define an entity-occurrence
as a specific instance of the general category entity-type.
If PROGRAM is the entity-type, ACCOUNTS RECEIVABLE could be
one entity-occurrence. Similarly, a relationship-occurrence
is a specific instance of the general category
relationship-type. The relationship-type ACCESSES may have as a
relationship-occurrence PROGRAM-ACCESSES-FILE. At this
level, we also talk about the specific characteristics of an
attribute-type. An attribute-type is the name of a
characteristic of an entity-occurrence, as Social Security Number
characterizes a student. An attribute-characteristic is not
the value of the attribute-type, but the parameters of an
attribute-type, such as its length and format. For example,
the attribute-type Social Security Number will be
characterized as eleven characters long (nine digits and two
hyphens), of the form 999-99-9999.
Entity-occurrences, relationship-occurrences, and
attribute-characteristics will be referred to as the descriptors of
the metadatabase.
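
An attribute-characteristic such as length and format can be
used to check candidate values. The sketch below assumes the
characteristics given in the text for Social Security Number
(eleven characters, of the form 999-99-9999); the constant
and function names are hypothetical.

```python
import re

# Attribute-characteristics for the attribute-type Social Security Number:
# eleven characters in all, of the form 999-99-9999.
SSN_LENGTH = 11
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def conforms(value):
    """Return True if a value matches the recorded length and format."""
    return len(value) == SSN_LENGTH and SSN_PATTERN.fullmatch(value) is not None


print(conforms("547-23-3410"))
print(conforms("547233410"))
```

Note that the characteristics describe what a valid value
looks like; the value itself belongs to the database, not to
the metadatabase.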

At the "real" data level of the organization's database,
we think in terms of actual values of data, such as
"Jennifer C. Brown", "547-23-3410", "left-handed monkey
wrench", "IBM 3033", or "93943". These are all values of
the attributes of an entity, and are called
attribute-values.

An example of each of the levels of data is given in
Figure 2.5. We will use the generic terms entity,
attribute, and relationship in this thesis where it is not
necessary to distinguish between the three levels.

When a data dictionary is received from the vendor, it
contains a system standard schema which includes certain
basic entity-types, attribute-types, and relationship-types
chosen by the vendor. A data dictionary is extensible if an
organization is able to customize the schema by defining its
own entity-types, attribute-types, and relationship-types in
addition to those included in the system standard schema.

Schema Descriptors              Example

Entity-type                     Record
Attribute-type                  Name
Relationship-type               Contains

Metadatabase Descriptors        Example

Entity-occurrence               Student
Attribute-characteristic        25 characters, alphanumeric
Relationship-occurrence         Student-contains-Name

Database Data                   Example

Attribute-value                 Ronald P. Markey


Figure 2.5   Comparison of Data Levels
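
The relationship between the schema level, the metadatabase
level, and extensibility can be sketched as follows. The
class and method names are assumptions made for illustration,
and the custom entity-type SHIP is hypothetical; it does not
appear in any vendor's standard schema discussed here.

```python
class DataDictionary:
    """Toy dictionary: a schema of entity-types plus a metadatabase
    of entity-occurrences."""

    def __init__(self, standard_entity_types):
        self.entity_types = set(standard_entity_types)  # schema descriptors
        self.occurrences = {}                           # metadatabase descriptors

    def extend_schema(self, entity_type):
        """Extensibility: add an organization-defined entity-type."""
        self.entity_types.add(entity_type)

    def define(self, entity_type, occurrence_name):
        """Record an entity-occurrence of an entity-type already in the schema."""
        if entity_type not in self.entity_types:
            raise ValueError(f"unknown entity-type: {entity_type}")
        self.occurrences.setdefault(entity_type, []).append(occurrence_name)


dd = DataDictionary(["RECORD", "PROGRAM", "FILE"])
dd.define("RECORD", "Student")
dd.define("PROGRAM", "ACCOUNTS RECEIVABLE")

dd.extend_schema("SHIP")                   # hypothetical custom entity-type
dd.define("SHIP", "U.S.S. Constellation")
print(dd.occurrences)
```

An occurrence can be defined only under an entity-type that
the schema already contains, which is why extending the
schema must come first.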

C.   THE  FUNCTIONS  OF  A  DATA  DICTIONARY 

The functions performed by a typical data dictionary
fall into four categories: definition, update, retrieval,
and software interface. A data dictionary should be
evaluated in each category according to the ease and success with
which the functions are performed.

1.   Definition

The first step in the implementation of a data
dictionary is to collect information about some portion of
an organization's data, such as the U.S.S. Constellation's
supply department. This is done by interviewing supply
department personnel, identifying the data received and
produced by the department, and analyzing the software that
manipulates that data. Once entities, attributes, and
relationships have been defined, these data elements are entered
into the data dictionary using the dictionary's data
definition commands. The elements are classified according to the
entity-types, attribute-types, and relationship-types of the
system standard schema, or the dictionary administrator may
use customized data types as necessary, assuming the
dictionary is extensible.

2.   Update

As an organization evolves, so does its data. One
of the functions of the data dictionary is to allow the
addition, modification, and deletion of elements. For
instance, a new Navy regulation might require the supply
department to keep track of certain data about a new
inventory item and to report this data quarterly. Or perhaps the
administrative department will have to change zip codes to
the new nine-digit format on all correspondence. Each of
these changes will be introduced via modifications to the
dictionary schema.

3.  Retrieval 

Information can be retrieved from a data dictionary
by using query language commands or the report-generating
capability of the dictionary. A dictionary will provide
structured commands or an English-like query language that
will help the supply department to find out the Navy part
number for a monkey wrench. It will also allow the
dictionary administrator to find out which users have access
to a particular subschema. Reports are produced by a data
dictionary according to a vendor-defined format or to user
specifications. Reports generally produce a larger volume
response than queries and are often printed out in hard
copy.

4.   Software Interface

The software interface function provides a means of
access to the data dictionary for applications software,
including compilers, editors, and database management
systems. A COPY command is used to bring data descriptions
(e.g., of records or files) directly into the program being
developed from the data dictionary. Thus, the job of the
programmer is made easier and data use is standardized. It
is also possible for applications software to directly
retrieve and make changes to the elements in a data
dictionary.
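
The idea behind a COPY facility can be sketched as generating
source text from stored field descriptions. The example below
is an assumption-laden illustration: the field list is
hypothetical, the function `copy_record` is not part of any
product, and the COBOL-style output is chosen only because it
is a familiar way to write record layouts.

```python
# Hypothetical dictionary entry for one record: (field name, type, length).
STUDENT_RECORD = [
    ("NAME", "alphanumeric", 25),
    ("SSN", "alphanumeric", 11),
    ("RANK", "alphanumeric", 3),
]


def copy_record(record_name, fields):
    """Render a record description as COBOL-style source lines."""
    picture = {"alphanumeric": "X", "numeric": "9"}
    lines = [f"01  {record_name}."]
    for name, kind, length in fields:
        lines.append(f"    05  {name:<10} PIC {picture[kind]}({length}).")
    return "\n".join(lines)


print(copy_record("STUDENT-RECORD", STUDENT_RECORD))
```

Every program that copies the description in this way uses
the same field names and lengths, which is how the facility
standardizes data use.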

III.  FEDERAL INFORMATION PROCESSING STANDARD FOR DATA DICTIONARY SYSTEMS

A.   INTRODUCTION 

The Institute for Computer Sciences and Technology of
the National Bureau of Standards is in the process of
developing a standard software specification for data
dictionaries. The Federal Information Processing Standard for Data
Dictionary Systems (FIPS DDS) is intended to serve as a
guideline for the evaluation and selection of data
dictionaries to be used by the federal government. The four
volumes¹ "specify and describe the functionality, database
structure, and user interfaces of the FIPS DDS" [Ref. 6].

We examined three volumes of the FIPS DDS: Command
Language Interface Specifications (volume 2), Interactive
Interface Descriptions (volume 3), and Dictionary
Administrator Support Specifications (volume 4). The
subject of each of the volumes corresponds to one of the
three categories of users who will interact with a data
dictionary--the experienced user, the relatively
inexperienced user, and the administrator of the data dictionary.

The FIPS DDS describes in detail a suggested system
standard schema for a data dictionary, including definitions
and use of the schema descriptors. Each of the volumes
presents the syntax for commands necessary for its target
users to manipulate the dictionary. In addition, the
results of each command are detailed, with error messages
and "successful completion" messages listed where
applicable.


¹Note: Volume 1 is not yet available for review. The
FIPS DDS is in draft form and has not been formally approved
by the National Bureau of Standards.

B.   SYSTEM STANDARD SCHEMA

The system standard schema set forth in the FIPS DDS
provides basic entity-types, attribute-types, and
relationship-types as follows:

Entity-types

1.  SYSTEM--a collection of processes and data

2.  PROGRAM--an automated process

3.  MODULE--an automated process which is a logical
subdivision of a PROGRAM or an independent process
called by a PROGRAM

4.  FILE--an organization's data collection

5.  RECORD--logically associated data which belongs to
the organization

6.  DOCUMENT--human-readable data collections

7.  ELEMENT--data belonging to the organization

8.  USER--members or collections of members belonging to
the organization using the facilities available in
the data dictionary

9.  DICTIONARY-USER--users of the dictionary system
itself

10.  ACCESS-CONTROLLER--specifies access restrictions to
an entity or set of entities in the dictionary

SYSTEM, PROGRAM, and MODULE are of the class "Process";
FILE, RECORD, DOCUMENT, and ELEMENT are of the class "Data";
USER is classed as "External"; and DICTIONARY-USER and
ACCESS-CONTROLLER are of the class "Security".
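
The grouping of entity-types into classes can be written
down directly as a lookup table. The table contents come from
the list above; the names `ENTITY_CLASSES` and `class_of` are
hypothetical and are not defined by the FIPS DDS.

```python
# Entity-types of the system standard schema, grouped by class.
ENTITY_CLASSES = {
    "Process": ["SYSTEM", "PROGRAM", "MODULE"],
    "Data": ["FILE", "RECORD", "DOCUMENT", "ELEMENT"],
    "External": ["USER"],
    "Security": ["DICTIONARY-USER", "ACCESS-CONTROLLER"],
}


def class_of(entity_type):
    """Return the class to which a standard entity-type belongs."""
    for class_name, members in ENTITY_CLASSES.items():
        if entity_type in members:
            return class_name
    raise KeyError(entity_type)


print(class_of("RECORD"))
print(class_of("ACCESS-CONTROLLER"))
```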

Attribute-types

There  are  55  attribute-types  included  in  the  system 
standard  schema,    similar   to    the   ones    shown    in   Figure    2.4. 

Relationship-types 

The  standard  relationship-types  provided  by  FIPS  are  as 
follows: 

1.  CONTAINS--describes entities composed conceptually
of other entities

2.  PROCESSES--shows the relationship between a process
and data

3.  RESPONSIBLE-FOR--shows the association between
entities representing organizational components and
entities denoting organizational responsibility

4.  RUNS--shows the relationship between a user and a
process

5.  TO--shows the flow between two processes

6.  DERIVED-FROM--shows that an entity is the result of
some operation on another entity

The FIPS DDS includes an extensibility facility to
provide for the customization of the system standard
schema to match the organization's needs.

C.   COMMAND LANGUAGE INTERFACE SPECIFICATIONS

The  experienced  user  is  one  who  is  familiar  with  the 
structure  and  commands  of  the  data  dictionary  and  who  needs 
access  to  the  full  functionality  of  the  data  dictionary. 
Command  language  commands  are  used  to  facilitate  this  access 
by  allowing  the  user  to: 

--define data elements

--maintain the dictionary (add/modify/delete)

--report on dictionary elements

--query the dictionary about data elements

--build entity lists and perform operations on groupings
of entities that meet certain criteria (useful for global,
vice individual, operations)

--support applications programs that interact with the
data dictionary

--perform general utilities, such as changing the mode
of operation and obtaining help information.

The syntax of each of the command language commands is
presented in the FIPS DDS using Backus-Naur form.² For
example, the following command would be used to modify an
entity that already exists in the dictionary:

MODIFY-ENTITY

{[WHERE] NAME [IS] <entity-name>

[ADD NEW-VERSION [<version-number>]]

WHERE ATTRIBUTES [ARE] <attribute-clause-1>
[,...., [<attribute-clause-n>]]}
where:

--entity-name refers to a single entity in the
dictionary

--NEW-VERSION is an optional clause which results in the
creation of a new entity which has a primary-name consisting
of the assigned-name of the entity-name specified and the
next-highest version-number

--attribute-clause-n refers to a clause used to
designate the attributes of the specified entity which are to be
modified
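
The way the optional and required parts of this syntax
combine can be sketched with a small command builder. This is
an illustration under stated assumptions: the function
`modify_entity` is hypothetical, the optional keywords are
always written out, each clause is placed on its own line for
readability, and the attribute clauses are treated as opaque
text because their internal syntax is not given here.

```python
def modify_entity(entity_name, attribute_clauses, add_version=False, version=None):
    """Assemble the text of a MODIFY-ENTITY command.

    attribute_clauses: list of already-formed clause strings (opaque here).
    add_version: include the optional ADD NEW-VERSION clause.
    version: optional explicit version number for that clause.
    """
    if not attribute_clauses:
        raise ValueError("at least one attribute clause is required")
    parts = ["MODIFY-ENTITY", f"WHERE NAME IS {entity_name}"]
    if add_version or version is not None:
        clause = "ADD NEW-VERSION"
        if version is not None:
            clause += f" {version}"
        parts.append(clause)
    parts.append("WHERE ATTRIBUTES ARE " + ", ".join(attribute_clauses))
    return "\n".join(parts)


# "CLAUSE-1" and "CLAUSE-2" are placeholders for real attribute clauses.
print(modify_entity("STUDENT", ["CLAUSE-1", "CLAUSE-2"], add_version=True))
```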

D.   INTERACTIVE  INTERFACE  SPECIFICATIONS 

The interactive interface for the relatively
inexperienced user is designed to lead the user step-by-step through
the desired operations. Without having to master the
command language commands, the interactive interface user
has a large subset of the total functionality available
within the data dictionary, including manipulation,
reporting, querying, and entity list operations. The FIPS
DDS recommends that this interface be implemented by means
of "panels" (screens) that are presented to the user in
sequence and which contain the following information areas:


²Backus-Naur form is explained in Appendix A.

1.  state area--tells the user where (in which
dictionary) he is and what he is doing

2.  data area--for entering and displaying data

3.  schema area--used mostly for dictionary updates to
show available options and limitations on actions

4.  message area--for error messages and warnings

5.  action area--tells the user how to proceed from the
current panel

6.  help area--for the display of help information
requested by the user

The user begins his session with the data dictionary at
a "home panel" which provides entry into the system. At any
point along the way he has the option of saving or undoing
any panel with which he has been working. This panel-driven
interface ensures that the user always knows where he is in
the dictionary, what mistakes he has made, what choices he
has to continue, and what help is available to him.

E.   DICTIONARY  ADMINISTRATOR  SUPPORT  SPECIFICATIONS 

The administrator of the data dictionary, of course, has
access to both the standard command language and the
interactive interface. His or her main concern, however, is the
management of the schema. This is accomplished by means of
a specialized set of commands for

--extending the system standard schema

--reporting on the schema

--implementing access control measures

--controlling export from and import to the dictionary.

We  have  already  defined  the  extensibility  facility  as 
the  ability  to  add  schema  descriptors  to  the  system  standard 
schema.  The  report  facility  allows  the  administrator  to 
generate  a  listing  of  the  entire  schema  or  any  subset 
thereof. The security facility provides commands for
restricting the access of users to the dictionary by
specifying which commands the user is allowed to execute. The
export/import facility allows transfer of parts of one
dictionary to another, but only between dictionaries whose
schemas are identical, in order to preserve the integrity of
the "target" dictionary.
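The two administrator safeguards just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class Dictionary, the function import_entries, and the command name "IMPORT" are hypothetical and are not taken from the FIPS DDS command language. The sketch shows a per-user list of permitted commands and an import that is refused unless the source and target schemas are identical.

```python
from dataclasses import dataclass, field


@dataclass
class Dictionary:
    """A toy dictionary: a schema plus the entries described under it."""
    schema: frozenset                             # names of schema descriptors
    entries: dict = field(default_factory=dict)   # entry name -> description
    allowed: dict = field(default_factory=dict)   # user -> set of permitted commands

    def may_execute(self, user, command):
        """Command-level access control: is this user allowed this command?"""
        return command in self.allowed.get(user, set())


def import_entries(source, target, names, user):
    """Copy the named entries from source to target.

    The transfer is refused if the user lacks the (hypothetical) IMPORT
    command on the target, or if the two schemas are not identical.
    """
    if not target.may_execute(user, "IMPORT"):
        raise PermissionError(f"{user} may not run IMPORT on the target")
    if source.schema != target.schema:
        raise ValueError("schemas differ; import refused to protect the target")
    for name in names:
        target.entries[name] = source.entries[name]
    return len(names)
```

Comparing whole schemas is the bluntest possible check; it mirrors the rule stated above rather than any finer-grained compatibility test.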

F.   EVALUATION 

It is certainly true that the FIPS DDS presents the
reader with very detailed specifications of the commands and
facilities for a standardized data dictionary; the volumes
we reviewed could serve as the basis for an initial design
specification for the development of data dictionary
software. A dictionary based on the FIPS specifications would
perform the required functions discussed in Chapter II and
would contribute to the organization's management of its
data. The military and the federal government would benefit
greatly from the availability of standard software to
achieve control over their data resources.

The major contribution of the FIPS DDS is its
orientation to the needs of the different kinds of users of a
data dictionary. This is particularly evident in the interface
that is suggested for use by inexperienced users of the data
dictionary. The panel-driven format with its six information
areas is far less intimidating than the syntax required
by the command language. Even so, the interactive interface
still requires a certain degree of sophistication on the
part of the "inexperienced" user if he is to be able to
manipulate the dictionary. Another strong point of the FIPS
DDS is its consistency of presentation and format. No
matter what the operation, the procedures needed to
manipulate the dictionary and the manner in which the
dictionary "responds" to the user are logical and predictable.
The commands, however, are complex and require knowledge of
Backus-Naur form.

Even though the FIPS DDS does indeed provide a
comprehensive software standard for the computer professional,
we do not believe that it achieves its goal of providing a
guide for the evaluation and selection of data dictionaries.
Although the addition of the introductory volume may help
remedy the problem, the three volumes of specifications
ignore the forest of reasons behind the implementation of a
data dictionary while concentrating solely on the patterns
of the leaves on each tree. The FIPS DDS will not be
extremely useful to the individual searching for basic
assistance in evaluating commercial data dictionary
packages. Many of the books and articles we have reviewed
provide better explanations of data dictionary features and
comprehensive evaluation criteria.

We found that the terminology that the FIPS DDS uses for
the dictionary schema and the metadatabase is not explained
clearly nor is it any less confusing than that of any other
publication. In addition, no specific examples of how an
organization's data would be entered in the data dictionary
are given. We feel that it is more important for the
potential data dictionary user to understand how a data
dictionary will assist in the management of data than to see
samples of every conceivable type of error message that
could occur. A summary of recommended features such as the
one we have just presented and a list of criteria for
evaluation would be far easier for the reader to digest.

None of the data dictionary packages we have reviewed
does things totally the "FIPS way", and it is unlikely that
any commercial dictionary vendor will ever conform exactly
to FIPS DDS guidelines. However, it is likely that the
federal government will insist that FIPS standards be
incorporated into future dictionaries intended for
government use. In the next chapter we will develop a set of
criteria for an "ideal" data dictionary, taking FIPS DDS
recommendations into account. In Chapter V we will examine
four commercial data dictionary packages and evaluate their
success in meeting the ideal criteria.

IV.  THE ROLE OF THE DATA DICTIONARY IN INFORMATION RESOURCE
MANAGEMENT

In this chapter we will see how a data dictionary can
contribute to the goal of efficient management of an
organization's data. We will first discuss the process of
development of an information system in an organization and
then will discuss the three objectives of data dictionaries
that we have identified as contributing the most to the
accomplishment of this goal: data security, data integrity, and
documentation/maintenance. We will then develop a set of
criteria for the "ideal" data dictionary to be used in the
evaluation of data dictionary packages.

A.   INFORMATION RESOURCE MANAGEMENT

Organizations  today  have  become  increasingly  aware  of 
the  need  to  manage  data  just  as  they  manage  other  essential 
resources.  If  properly  managed,  the  necessary  data  will  be 
available,  up-to-date,  and  retrievable  when  required  to 
provide  information  that  is  of  value  to  the  organization. 
This concept is known as Information Resource Management, or
IRM, although it might also be referred to as Data Resource
Management.

IRM has been the focus of a great deal of interest in
recent years. In October of 1983, the Institute for
Computer Sciences and Technology of the National Bureau of
Standards (NBS) and the Association for Computing Machinery
(ACM) co-sponsored a workshop on IRM strategies and tools.
It was based on the premise that

IRM is currently one of the most significant topics
being discussed concerning information systems, and is
being discussed along a variety of lines of thought.
These include business systems planning; information
systems analysis, design, and development; database
design and implementation; the disciplines of office
management, paperwork management, and information
sciences management; and the various problems and costs
associated with implementing IRM to include each of
these areas. [Ref. 7]


The  Proceedings  of  the  workshop  defined  IRM  as 

whatever policy, action, or procedure concerning
information (both automated and non-automated supported)
which management establishes to serve the overall
current and future needs of the enterprise. Such
policies, etc., would include considerations of
availability, timeliness, accuracy, integrity, privacy,
security, auditability, ownership, use, and cost
effectiveness. [Ref. 8]

The recommendations of the NBS/ACM workshop on the role that
the data dictionary should play in IRM were incorporated
into the Federal Information Processing Standard for Data
Dictionary Systems that we discussed in Chapter III.

In order to understand how the data dictionary
contributes to the production of valuable information for an
organization, we will look more closely at the organization
itself and at its functions. An organization is made up of
many systems that convert resources into usable output. An
information system, then, is one that takes raw data and
transforms it into information that can be used by the
organization. If the process by which the organization develops
its information systems is the heart of information resource
management, then it is the data dictionary that keeps it
ticking.

Assume that the U.S.S. Constellation has identified a
problem with the way a particular information system is
currently operating--it could be preventive maintenance
record-keeping, the supply department inventory, the
personnel administration system, or a system that affects
the entire organization. The process of analyzing the
system and developing a system to solve this problem evolves
through four distinct phases, called the System Development
Life Cycle (SDLC). We will show how the data dictionary
supports the SDLC, and thus IRM, through planning, study,
design/coding, and operation and maintenance. We have based
our analysis of the SDLC on that of Leong-Hong and Plagman
[Ref. 9].

1.   Planning Phase

The Proceedings of the NBS/ACM workshop emphasized
the need for a "top-down" approach to IRM in an
organization. During the planning phase, the organization's
long-range plans, its functions, and structure are analyzed to
ensure that any information system that is developed will
complement those needs.

If a data dictionary is already in existence, it can
provide information about the functions of the organization
that have been defined, or it can document the initial
definition of those functions. For each function, it must be
determined who does it, what is produced, what other
functions it interacts with, and what inputs are needed to
accomplish the function. As an example, we can say of the
Payroll function that it is performed by the disbursing
office, paychecks and leave and earnings statements are
produced, it interacts with the personnel administration
system, and it requires data about all members of the crew,
including rank/rate, time in service, and so on.

At  this  stage  of  the  development  process,  the  "big 
picture"  is  drawn  while  the  details  are  left  until  later. 
Thus, general categories of data such as "accounting data"
and  "personnel  data"  and  the  transactions  that  affect  them 
are  defined  and  entered  in  the  dictionary. 

In the aggregate, this planning information
constitutes a conceptual data model. "Definition and analysis of
subsequent information requirements (and eventually,
database design) will be dependent upon this data model"
[Ref. 10]. The fact that the development of this model has
been automated, rather than manual, ensures a quicker,
standardized process.

2.   Study Phase

At  this  point  in  the  SDLC,  a  greater  level  of  detail 
is  introduced.  The  data  dictionary  provides  a  common,  stan- 
dardized source  of  information  about  the  inputs  and  outputs 
of  the  organization's  functions.  Specific  entities,  attri- 
butes, and  relationships  are  chosen  from  the  general  catego- 
ries of  data  identified  in  the  planning  phase.  The  entity 
PART  in  the  Constellation's  inventory  system  may  be 
described  by  the  attributes  Navy  Part  Number,  Description, 
Storage  Location,  and  Quantity.  There  may  also  be  a  many- 
to-many  relationship  assigned  between  PART  and  DEPARTMENT. 
Reports  required  to  be  produced  are  also  defined  and  the 
necessary  input  data  is  identified. 
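The PART example can be written down as dictionary entries to make the idea concrete. The following Python sketch is illustrative only: the classes Entity, Relationship, and DataDictionary, the relationship name USED_BY, and the DEPARTMENT attribute are assumptions introduced here, not part of any dictionary package discussed in this study. It records entities with their attributes, rejects an entity name that is already defined, and allows relationships only between entities the dictionary already knows.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    attributes: list


@dataclass
class Relationship:
    name: str
    left: str            # name of one participating entity
    right: str           # name of the other participating entity
    cardinality: str     # e.g. "many-to-many"


@dataclass
class DataDictionary:
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_entity(self, entity):
        # Refusing a duplicate name is a simple form of redundancy detection.
        if entity.name in self.entities:
            raise ValueError(f"{entity.name} is already defined")
        self.entities[entity.name] = entity

    def add_relationship(self, rel):
        # Both ends must already be described in the dictionary.
        for end in (rel.left, rel.right):
            if end not in self.entities:
                raise KeyError(f"unknown entity {end}")
        self.relationships.append(rel)


dictionary = DataDictionary()
dictionary.add_entity(Entity("PART", ["Navy Part Number", "Description",
                                      "Storage Location", "Quantity"]))
dictionary.add_entity(Entity("DEPARTMENT", ["Department Name"]))  # attribute assumed
dictionary.add_relationship(
    Relationship("USED_BY", "PART", "DEPARTMENT", "many-to-many"))
```

A second attempt to define PART would raise an error, which is the point at which an analyst would be told that the data already exists.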

This  information  provides  what  is  called  a  detailed 
conceptual model, an expansion of the conceptual model of
the  planning  phase.  The  data  dictionary  can  be  used  to 
identify  redundancy  within  the  data  model  by  determining 
whether  the  data  entered  already  exists.  In  addition,  with 
the  aid  of  the  dictionary,  the  systems  analyst  will  be 

able to determine what data is available, how it is
being used, how it can be accessed, who has primary
responsibility for its definition and upkeep, and most
important, whether there is conflict in using this data,
that is, what impact it will have on other application
systems [Ref. 11].

3.   Design/Coding Phase

The purpose of the design phase is to provide
specifications for programming and implementing the system. It
is here that the data dictionary's schema descriptors will
be used or expanded to meet the needs of the system. If a
database does not already exist, and it is determined that
one is required, the data dictionary schema will provide a
basis from which to implement one. Data integrity is
enforced because the dictionary serves as the sole source of
data definition and structure.

When software is being coded, the data dictionary
provides  documentation  for  the  programmer  and  a  COPY 
facility  for  transporting  record  definitions,  for  example, 
into  the  program  being  developed.  An  important  element  of 
the  dictionary  is  the  constraints  that  are  defined  for  data 
values.  In  this  way,  data  that  is  input  to  a  program  can  be 
checked  against  the  constraints  that  have  been  established. 
Documentation  of  the  program  includes  the  author,  a  descrip- 
tion, input  requirements,  output  produced,  and  information 
on  what  other  programs  are  called  upon,  all  of  which  are 
incorporated  into  the  data  dictionary. 
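Checking program input against constraints held in the dictionary might look like the following Python sketch. It is a minimal illustration, not the mechanism of any particular product: the class FieldConstraint, the function violations, and the specific constraint kinds chosen (mandatory, numeric, minimum and maximum length) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldConstraint:
    """One field's constraints as they might be recorded in the dictionary."""
    name: str
    mandatory: bool = True
    numeric: bool = False
    min_length: int = 0
    max_length: int = 255


def violations(record, constraints):
    """Return a list of messages for every constraint the record breaks."""
    problems = []
    for c in constraints:
        value = record.get(c.name)
        if value is None or value == "":
            if c.mandatory:
                problems.append(f"{c.name}: missing mandatory value")
            continue
        text = str(value)
        if c.numeric and not text.isdigit():
            problems.append(f"{c.name}: must be numeric")
        if not (c.min_length <= len(text) <= c.max_length):
            problems.append(f"{c.name}: length {len(text)} not allowed")
    return problems


# Hypothetical constraints for two fields of an inventory record.
PART_CONSTRAINTS = [
    FieldConstraint("quantity", numeric=True, max_length=5),
    FieldConstraint("description", mandatory=False, max_length=40),
]
```

Because every program reads the same constraint definitions, a change to a limit is made once in the dictionary rather than in each program.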

4.   Operation and Maintenance

After a new system has been implemented, the work of
the data dictionary does not end. All of the documentation
that has been recorded during the development of the system
serves as a base of reference for the users of the system.
In addition to the database administrator and the
administrator of the dictionary, the key players in information
resource management who benefit from the use of a data
dictionary fall into six groups, according to Allen, Loomis,
and Mannino [Ref. 12]:

1.  The data administrator, who is responsible for the
overall administration of the data resource, uses
the dictionary as a tool to enforce the way data is
stored, maintained, and monitored.

2.  Data processing managers benefit from the
dictionary's reports on data usage.

3.  Operations  personnel  retrieve  information  from  the 
dictionary  about  jobs  that  are  being  run. 

4.  Programmers  and  analysts  use  the  dictionary  to 
retrieve  data  definitions  and  to  document  a  system 
being  developed. 

5.  End users access the data dictionary for
descriptions of their data views.

6.  Finally, auditors will use the documentation
provided  by  the  data  dictionary  to  trace  data  and 
programs  as  they  are  used  in  the  computer  system. 

It is the process of implementing a data dictionary that
we have just described--the analysis of the organization,
the definition of its functions, and the documentation of
its information systems--that makes the dictionary so
important in information resource management. We have seen that
during the development of an information system, the data
dictionary is involved from the initial planning stage,
through the programming process, through the operation, and
into the maintenance of the system. The dictionary provides
the standards for data which will be used throughout the
life of the system and referenced when developing other
systems. Key contributions include decreasing the amount of
redundancy of data required to be stored, enforcing security
of the valuable data resource through access controls and
implementation of user views, and providing documentation
which serves as a "corporate history" and as a reference
upon which maintenance and auditing are based. These
objectives of data dictionary usage are discussed in detail in
the next section.

B.   OBJECTIVES OF A DATA DICTIONARY

In this section we will focus on the three major
contributions of the data dictionary to the management of an
organization's data. These are data security, data integrity,
and documentation/accountability. Although we recognize
that other objectives of data dictionary usage might be
identified, we believe that each will fall into one of these
three major groupings.

1.   Data Security

There are two distinct levels of security of the
data in an organization's database which will be provided
either by the data dictionary or by the database management
system itself. First, procedures should exist to ensure
that only authorized personnel are allowed to access the
information contained within the database. The widespread
use of computers and the increasing sophistication of users
have made an organization's data vulnerable to embezzlers,
amateur "hackers", corporate spies, and careless employees.
Second, the system should contain provisions for controlling
the amount and types of data that each authorized user is
allowed to access within the system. Some of the
sophisticated data dictionaries, for example, include a trace
mechanism which increases security by recording every inquiry
that is made into system files and data. If an intrusion is
made into the system by unauthorized personnel, the
specifics of that inquiry, including the data which was
accessed, will be recorded.

Metadata should be afforded at least as much
protection as the data in the database, if not more.
Leong-Hong and Plagman [Ref. 13] present an example of the
importance of the security of metadata as it concerns

the data resources in intelligence/military applications
such as the classification code of intelligence
documents. When security profiles for the metadata entities
are stored in the metadatabase, unauthorized access to
the metadata could be most damaging. This is because
presumably one would be able to 'crash into' the system
using that information.


It is the task of the dictionary administrator to analyze
the metadata to determine the levels of security required
and to grant access privileges (read and write, read only,
update) to users for certain portions of the metadata.
Information about users, their passwords, and privileges is
stored in the data dictionary and is accessible only to
personnel authorized by the administrator.

We have already shown in Figure 2.3 that subschemas
contribute to security by limiting the size of the "window"
through which a database user looks at data. When a user
attempts to access a particular subschema, the request is
routed through the data dictionary to determine whether
access is authorized and, if so, the structure of the
subschema. Only at this point is the "real" data in the
database accessed.
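The routing of a request through the dictionary can be sketched as follows. This Python fragment is a simplified illustration under stated assumptions: the classes Subschema and DictionaryGate, the view name PAY_VIEW, and the user names are hypothetical. It shows the dictionary deciding whether access is authorized, recording every inquiry in a trace, and only then exposing the fields that fall inside the subschema's "window".

```python
from dataclasses import dataclass, field


@dataclass
class Subschema:
    name: str
    fields: tuple          # the fields visible through this "window"
    readers: frozenset     # users authorized to look through it


@dataclass
class DictionaryGate:
    subschemas: dict                            # subschema name -> Subschema
    trace: list = field(default_factory=list)   # one entry per inquiry

    def fetch(self, user, subschema_name, database_row):
        sub = self.subschemas.get(subschema_name)
        allowed = sub is not None and user in sub.readers
        # Every inquiry is recorded, whether or not it succeeds.
        self.trace.append((user, subschema_name, allowed))
        if not allowed:
            raise PermissionError(f"{user} may not access {subschema_name}")
        # Only now is the "real" data touched, and only the visible fields.
        return {name: database_row[name] for name in sub.fields}


gate = DictionaryGate({
    "PAY_VIEW": Subschema("PAY_VIEW", ("name", "rank"),
                          frozenset({"disbursing"})),
})
row = {"name": "Smith", "rank": "E5", "ssn": "000-00-0000"}
visible = gate.fetch("disbursing", "PAY_VIEW", row)
```

In this sketch the Social Security Number never leaves the database for a PAY_VIEW reader, and a refused request still leaves a record in the trace.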

2.   Data Integrity

The  keys  to  data  integrity  are  the  control  of  inputs 
to  the  database  and  the  minimization  of  data  duplication. 
Properly used, these keys will enhance communication between
users  by  ensuring  that  a  single,  correct  source  of  data  is 
maintained. 

Because the data in a database is shared among many
users, it is essential to have some means of enforcing
standards for entering data, updating it, and maintaining it.
For example, the data dictionary identifies constraints, or
limitations on the values data can have. Fields can be
defined as being mandatory or optional, alphanumeric or
numeric, and as having a minimum or maximum length. The data
dictionary contains comments on how data should be used in
order to assist those using the data dictionary. Another
important control feature of a data dictionary is how it
deals with synonyms--an entity or attribute with more than
one name. For instance, the entities EMPLOYEE,
REGIONAL_MANAGER, and EXECUTIVE may all be used by different
departments in the organization to refer to Linda Smith.
The administrator must standardize the terminology used in
the organization and eliminate as many synonyms as possible.
When this is not feasible, all of these synonyms, or
aliases, must be recorded in the data dictionary. Van Duyn
[Ref. 14] explains that

It  is  not  unusual  to  have  similar  types  of  data  elements 
in  the  database  and  in  various  applications.  In  such 
cases, and in cases where the same data type is known by
other names, the DDS [data dictionary] can be used to
inform the users of the relationships that exist among
these data and of the disposition of their usage. In
other  words,  the  DDS  provides  information  as  to  which 
modules/programs  and  systems  use  the  same  data  type  and 
how  they  relate. 
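Recording aliases so that every name leads back to one standard term can be sketched briefly in Python. The class AliasRegistry and its methods are hypothetical, and treating EMPLOYEE as the standard name is an assumption made only for this illustration. The sketch maps each known name to its standard name and refuses to let one alias point to two different standards.

```python
class AliasRegistry:
    """Maps every recorded name, standard or alias, to its standard name."""

    def __init__(self):
        self._standard_for = {}

    def register(self, standard, *aliases):
        target = standard.upper()
        for name in (standard, *aliases):
            key = name.upper()
            existing = self._standard_for.get(key)
            if existing is not None and existing != target:
                raise ValueError(f"{key} already refers to {existing}")
            self._standard_for[key] = target

    def resolve(self, name):
        """Return the standard name for any recorded name."""
        return self._standard_for[name.upper()]

    def aliases_of(self, standard):
        """List the recorded aliases of a standard name, alphabetically."""
        target = standard.upper()
        return sorted(n for n, s in self._standard_for.items()
                      if s == target and n != target)


registry = AliasRegistry()
registry.register("EMPLOYEE", "REGIONAL_MANAGER", "EXECUTIVE")
```

With the aliases recorded, a query phrased in any department's vocabulary can be resolved to the same entry, which is what keeps the single source of data single.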

The  data  dictionary  also  contributes  to  data  integ- 
rity because  it  reduces  the  necessity  for  duplication  of 
data  and  therefore  lessens  the  opportunities  for  error.  The 
information  about  the  components  of  different  subschemas  of 
the  same  logical  view  is  stored  in  the  data  dictionary  in 
place  of  the  data  itself.  A  user,  whether  writing  a  program 
or  creating  a  new  entity-type,  should  be  able  to  query  the 
data  dictionary  to  ensure  that  the  necessary  routines  or 
entities do not already exist within the system.   Perhaps


one of the most important benefits of DDS [data
dictionaries] is that because it gives accurate and timely
information, management can control more efficiently not
only the automated and manual data of the enterprise but
all its resources and operations. Consequently,
management is provided with precise and accurate data for
quick, profitable decision-making [Ref. 15].

Thus, the possibility of two users querying the database and
receiving different answers to the same question at the same
time is decreased.

3.   Documentation/Maintenance 


Because maintenance is the most expensive and
time-consuming phase of software development, documentation and
maintenance of the organization's data is probably the most
significant objective of the data dictionary. It is a fact
of software life that documentation is often avoided during
system development and program design. To a large extent,
this is because documentation can be prepared as an
"afterthought"; it is not essential to the operation of the
system. But when a system is developed that includes a data
dictionary from the beginning, the data which is required by
the data dictionary forces documentation to become an
integral part of the design. "The use of a dictionary provides
documentation of a quality and form that is simply not
available through less formalized procedures in the data
processing environment" [Ref. 16].

The data dictionary can also reduce the amount of
effort required by maintenance personnel because it provides
"a 'roadmap' for the programmer doing maintenance. It
records the programs being maintained, their data structures
and their relationships" [Ref. 17]. We have defined an
active data dictionary as one in which information is
created, accessed, or modified through the data dictionary
interfaces with new or changed metadata automatically stored
in the data dictionary. This "continuous maintenance" can
be used to allow the database administrator to monitor where
data is used, who uses it, how often it is used, and what
changes have been made to it. Because the data dictionary
provides a wealth of documentation, it is possible to trace
an "audit trail" through the organization's data, from user
names and departments to the kind of data used in a program
to how many records a certain field appears in. Also,

The tracking of how programs/modules use particular data
as well as which files/segments contain certain data is
extremely important to the systems analyst in performing
system changes. Through the DDS [data dictionary], he
or she is able to ascertain what impact the proposed
changes will have on other components of the system and
upon functional areas within the enterprise. By having
an accurate, up-to-date assessment of the location and
usage of data that will be involved in the system
change, the analyst can accomplish the task more
efficiently [Ref. 18].
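The kind of "where used" tracking described in this passage can be illustrated with a short Python sketch. The class UsageIndex and the program and record names used with it are hypothetical; the fragment simply records which programs use a field and which records contain it, and reports both when a change to that field is proposed.

```python
from collections import defaultdict


class UsageIndex:
    """Cross-reference of fields to the programs and records that use them."""

    def __init__(self):
        self._programs_by_field = defaultdict(set)
        self._records_by_field = defaultdict(set)

    def record_program_use(self, program, fields):
        for name in fields:
            self._programs_by_field[name].add(program)

    def record_layout(self, record, fields):
        for name in fields:
            self._records_by_field[name].add(record)

    def impact_of_change(self, field_name):
        """List everything that would be touched by changing one field."""
        return {
            "programs": sorted(self._programs_by_field.get(field_name, ())),
            "records": sorted(self._records_by_field.get(field_name, ())),
        }


index = UsageIndex()
index.record_layout("Personnel", ["Social Security Number", "Rank"])
index.record_layout("Payroll", ["Social Security Number", "Base Pay"])
index.record_program_use("PAYCALC", ["Social Security Number", "Base Pay"])
index.record_program_use("ROSTER", ["Rank"])
```

Asking for the impact of changing "Social Security Number" in this sketch names one program and two records, which is the list an analyst would want before altering the field.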


Once  an  organization  has  decided  to  make  a  commit- 
ment to  manage  its  data  using  a  data  dictionary,  it  must 
decide  what  kind  of  data  dictionary  best  suits  its  partic- 
ular needs.  In  the  next  section,  we  will  look  at  the 
features  of  what  we  have  called  the  "ideal  data  dictionary" 
as  a  basis  for  evaluating  the  many  commercially  available 
data  dictionary  packages  from  which  the  organization  must 
choose. 

C.   THE IDEAL DATA DICTIONARY

Having  identified  the  functions  of  a  data  dictionary  in 
Chapter  II  and  how  they  support  the  accomplishment  of  the 
objectives  just  discussed,  it  will  be  helpful  to  use  these 
concepts  to  evaluate  data  dictionaries.  The  "ideal"  data 
dictionary  would  be  one  that  possesses  all  the  capabilities 
necessary  to  support  all  potential  users  in  all  possible 
applications.  However,  this  ideal  dictionary  would  be 
impossible  to  conceptualize,  much  less  to  create.  The  ideal 
data  dictionary  for  an  organization  will  depend  on  the  orga- 
nization's size,  functions,  and  needs.  The  potential  users 
of  a  dictionary  will  have  to  develop  a  set  of  criteria  upon 
which a candidate will be judged.

Many references provide criteria for evaluating data
dictionaries and identify those characteristics which are
vital to the management of the data resource.
Unfortunately, it is difficult to find two references that
propose the same criteria. One excellent source, Leong-Hong
and Plagman [Ref. 19], lists nine categories for evaluation:

1.  data description facility

2.  data  documentation  support 

3.  metadata  generation 

4.  security  support 

5.  integrity  support 

6.  user  interface 

7.  ease  of  use 

8.  resource  utilization 

9.  vendor  support 

It is important to recognize a distinction between two
categories of criteria for the ideal data dictionary: those
that evaluate the vendor and operating environment, and
those that evaluate the data dictionary itself. In the
former category, items like vendor support and reliability,
the choice between free-standing or DBMS-dependent data
dictionaries, the degree of integration with other system
components, and the quality of system documentation are
important considerations that may drive the decision between
two comparable data dictionaries. It is, however, the
latter type of criterion that will be vital in
identification of the essential requirements of the ideal data
dictionary. We have grouped all such requirements into six
categories: system standard schema and extensibility,
command and query languages, ease of use (including menus),
security, documentation and reports, and application
interfaces. (We have assumed that the objective of data
integrity will be accomplished by the correct, and enforced, use
of any data dictionary.) If a particular dictionary fully
supports each of these six criteria, then it will most likely
meet all of the organization's data management needs.
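As a simple aid to applying the six categories, an evaluator could keep them as a checklist. The Python sketch below is only one possible bookkeeping device, not a scoring method proposed in this study; the function unmet_criteria is hypothetical, and judging whether a package "supports" a category remains the evaluator's task.

```python
# The six categories named above, in the order they are listed.
CRITERIA = (
    "system standard schema and extensibility",
    "command and query languages",
    "ease of use",
    "security",
    "documentation and reports",
    "application interfaces",
)


def unmet_criteria(supported):
    """Return the categories a candidate package does not fully support."""
    unknown = set(supported) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unrecognized criteria: {sorted(unknown)}")
    return [c for c in CRITERIA if c not in supported]
```

A candidate for which the function returns an empty list is one that, by the reasoning above, will most likely meet the organization's data management needs.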

1.   System Standard Schema and Extensibility

The ideal data dictionary must provide a system
standard schema with all the descriptors necessary to
support the range of applications required by the
organization while still being simple enough to be competitively
priced. It must provide "enough" descriptors to be fully
capable without providing so many that the schema becomes
confusing. Additionally, the ideal dictionary must support
the user (or data dictionary administrator) in modifying
existing schema descriptors and creating new entities,
relationships, and attributes. This extensibility is vital in
supporting applications specific to the organization's
needs.

2.   Command and Query Languages

The ideal dictionary must provide both command and
query languages. The command language must support creation
and modification of data structures and subsequent entry of
data into those structures. The command language must
include edit commands to facilitate addition, modification,
and deletion of system data. It should include commands
restricted to use by the data dictionary administrator,
e.g., password assignment. The ideal system will include a
query language to support the analysis and production of
usable information from the organization's data. Perhaps
one of the most important features of a data dictionary (and
database), query languages allow data to be screened in
order to provide concise and specific information to support
timely management decisions.

3.  Ease of Use

Ease of use, or user-friendliness, is another
important aspect of the ideal data dictionary. It must be
supportive of new users while still providing full
functional support of the system "experts". Two primary
ingredients of user-friendliness are the availability of menus
and carefully conceived examples in the dictionary's
reference manuals. A hierarchy of menus can reduce complex
operations to a series of smaller, friendlier steps while user
documentation provides easy-to-understand examples that
guide the inexperienced user through each phase of system
operation. As microcomputers and the concept of the
automated office continue to spread, ease of use will become an
even more important consideration in deciding which software
products to utilize.

4.  Security 

Security will be a vital concern of the ideal data
dictionary. Protection and control of system information
must be provided. The data dictionary administrator must be
provided the capability to control personnel access to
system data. He or she must also be able to grant different
degrees of access to different users. Similarly, users
should have the capabilities to protect, and grant access
to, those structures and data which they control.

5.  Documentation and Reports

The documentation and reports created by the ideal
data dictionary must also be clear and understandable.
Timely and accurate preparation of reports is a key
objective of any DBMS. The data dictionary is uniquely qualified
to assist with this function. By ensuring the integrity of
data accessed and supporting query commands, the ideal data
dictionary can provide reports and documentation to answer
specific questions as they arise.

6.  Application Interfaces

The final important characteristic of the ideal data
dictionary is its ability to interface with the other
applications that may exist in the organization. If the data
dictionary is free-standing, it should interface with many
of the currently available database management systems. If
DBMS-dependent, the dictionary should interface with all
components of that system. Additionally, the ideal data
dictionary should interface with code generators,
communication systems, and other agents of the users' environment.

In the following chapter, we will study and evaluate
four of the popular data dictionaries that are currently
available. We will use these characteristics of the ideal
data dictionary that we have defined to compare and contrast
the features of the four dictionaries. In addition, each
will be compared to the "standard" dictionary presented in the
FIPS DDS.

V.  EVALUATION OF COMMERCIAL DATA DICTIONARIES

The purpose of this chapter is to review and evaluate a
cross-section of commercial data dictionary packages. We
selected four dictionaries: DATA DESIGNER, DATAMANAGER,
ORACLE, and DATADICTIONARY. User documentation and library
sources were the primary sources of information for our
evaluation. Additionally, ORACLE was available on the Naval
Postgraduate School's VAX minicomputer, and we observed
demonstrations of DATA DESIGNER and DATADICTIONARY.

A.   DATA  DESIGNER 

DATA DESIGNER is a free-standing data dictionary
developed by Database Design, Inc. It was introduced in 1975
with the goal of supporting logical database design by
solving some of the traditional problems associated with
multiple-application database management systems, such as
duplication of data, excessive storage requirements, data
consistency, complexity, and modifiability. DATA DESIGNER
can be used in conjunction with a variety of database
management systems, including IMS, IDMS, ADABAS, NOMAD, and
others. Additionally, it can produce designs that will
interface with COBOL and other non-DBMS tools or systems.

DATA  DESIGNER  can  be  characterized  as  an 


automated, easy-to-use tool that assists the database
designer in formulating normalized views of the data
requirements and synthesizes these views into a
canonical normalized form. . . . DATA DESIGNER maintains
information needed to physically structure the database
for efficient performance [Ref. 20].


In addition to providing the standard functions of a data
dictionary, DATA DESIGNER goes several steps beyond. It
provides an extensive set of commands categorized as user
commands, edit commands, and plotting commands, as shown in
Table 1. It also supports limited production of models and
graphics. Furthermore, DATA DESIGNER's capabilities include
powerful generation options and report features that will
support the design and maintenance of applications.


TABLE 1
Standard Commands of DATA DESIGNER


User Commands

ADD              BATCH            BUILD
COPY             CREATE           EMPTY
END              FILES            GENERATE
HELP             HIERARCHY        PLOT
PRINT            RENAME           REPORT
SHOW OPTIONS     TRANSFER         VALIDATE


Edit Commands

DELETE           EDIT             INSERT
LIST             RENUMBER         REPLACE


Plotting Commands

DRAW             DONE             RETURN
SET ALT          SET DEVICE       SET RANGE
SET TITLE        SET TYPE         SHOW

DATA DESIGNER supports logical database design through a
five-step process:

1.  A data dictionary file is created that contains a
list of all standard data item names to be used.

2.  Subschema files are created that describe all of the
views necessary to support user data requirements.


3.  The encoded user views are validated. This step
verifies the syntax of each view and ensures that
each data item name listed in a view is in the data
dictionary.

4.  All of the verified user views are synthesized into
a logical data model. Reports and diagrams are
generated to reflect this model.

5.  The model is evaluated to ensure that it meets all
user requirements and is modified as necessary by
repeating steps (1) through (4).

DATA DESIGNER utilizes three kinds of files: dictionary
files, subschema files, and generated design files. A
dictionary file ($DIC) contains a list of all data elements
that will be used in an application or subschema. This list
serves as a base for further development, e.g., additional
views. A subschema file ($SUB) contains data items and
relationships pertaining to particular views. Finally, the
generated design file ($DES) contains a logical data model
generated by DATA DESIGNER using the applicable dictionary
and subschema files as input. The generated design files,
in turn, serve as the input for the report and graphics
functions.

Key  commands  utilized  during  the  creation  of  a  logical 
database  design  include  the  following: 

CREATE--defines dictionary and subschema files.

BUILD--enters data item names into created files.

VALIDATE--compares the subschema files to the
dictionary file.

GENERATE--creates a logical DB design from the
validated files.

REPORT--prepares design documentation for the
logical design.

PLOT--uses the plotting subsystem to draw the
logical design.

EDIT--supports modification of existing files when
necessary.

In order to acquaint the reader with the operation of
DATA DESIGNER, we will demonstrate the dialog associated
with each step of the process necessary to create our Naval
Postgraduate School database example of Chapter II. The
user of DATA DESIGNER must first create the dictionary file
STUDENT.DIC and the subschema file STUDENT.SUB (user inputs
are indicated by boldface type) as follows:

>CREATE STUDENT.DIC DICTIONARY
DDFC0101I  File "STUDENT.DIC" of type "$DIC" created.

>CREATE STUDENT.SUB SUBSCHEMA
DDFC0203I  File "STUDENT.SUB" of type "$SUB" created.

Next, the BUILD command is used to load data items into the
two created files. First, all possible data items are listed
in the dictionary file:

>BUILD STUDENT.DIC
DDBS0065I  The file type is $DIC.
DDBS0018I  There are no records in the file.

B>NAME

B>SSN

B>SERVICE

B>RANK

B>DONE
DDBS0064I  File building is done.
DDBS0068I  4 records were entered
DDRN0098I  Line 1100 is now the last line in your file.

The subschema file will support creation of one or more user
views. In our example, the subschema file contains two
views, the basic, overall view and the view intended for
Army use only. Notice that after the user enters the BUILD
process, each line must start with a modeling code. These
codes are used to identify components and to establish
relationships within the views. When building the subschema
files, all desired relationships must be specifically
stated. DATA DESIGNER uses "1" to specify a one-to-one
relationship and "M" for a one-to-many relationship. A
complete list of the modeling codes used in this example
appears in Table 2.

>BUILD STUDENT.SUB
DDBS0075I  The file type is $SUB.
DDBS0081I  There are no records in the file.

B>V,STU-1

******************************************
*  THIS VIEW SUPPORTS THE OVERALL VIEW   *
******************************************

B>F,0100

B>T,0003

B>K,SSN

B>1,NAME

B>1,SERVICE

B>1,RANK

B>V,STU-ARMY

******************************************
*  THIS VIEW SUPPORTS THE ARMY VERSION   *
******************************************

B>F,0125

B>T,0002

B>K,SSN

B>1,NAME

B>1,RANK

B>DONE
DDBS0064I  File building is done.
DDBS0068I  13 records were entered


TABLE 2
DATA DESIGNER Modeling Codes

Code    Modeling Use

V       Name a user view
F       Specify frequency of use
T       Specify req'd response time
K       Name a key
C       Concatenate keys and data
S       Concatenate keys in short way
L       Label a data group
M       Identify a multiple association
1       Identify a single association
N       Name an association
*       Insert comments
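
To make the role of the modeling codes concrete, the following Python sketch reads lines like those entered in the BUILD dialog and groups them into views. It is an illustration only and not DATA DESIGNER's implementation: it assumes every line has the form CODE,VALUE, that a V line opens a new view, and it handles only the codes used in our example. The function name parse_subschema and the layout of the result are our own invention.

```python
def parse_subschema(lines):
    """Group CODE,VALUE lines into views; the result layout is hypothetical."""
    views = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # blank line or comment (code "*" in Table 2)
        code, _, value = line.partition(",")
        value = value.strip()
        if code == "V":  # a V line opens a new user view
            current = {"name": value, "frequency": None, "response_time": None,
                       "keys": [], "single": [], "multiple": []}
            views.append(current)
        elif current is None:
            raise ValueError("modeling code before any view: " + line)
        elif code == "F":
            current["frequency"] = int(value)
        elif code == "T":
            current["response_time"] = int(value)
        elif code == "K":
            current["keys"].append(value)
        elif code == "1":
            current["single"].append(value)
        elif code == "M":
            current["multiple"].append(value)
        else:
            # C, S, L, and N are deliberately left out of this sketch
            raise ValueError("code not handled in this sketch: " + code)
    return views


student_sub = [
    "V,STU-1", "* overall view", "F,0100", "T,0003",
    "K,SSN", "1,NAME", "1,SERVICE", "1,RANK",
    "V,STU-ARMY", "F,0125", "T,0002", "K,SSN", "1,NAME", "1,RANK",
]
views = parse_subschema(student_sub)
for view in views:
    print(view["name"], view["keys"], view["single"])
```

Keeping the key and the single associations in separate lists mirrors the distinction the codes K and 1 draw in the subschema file.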


Once the dictionary and subschema files are formatted,
the VALIDATE command is used to ensure that all entries and
relationships in the subschema files are valid based on the
information previously specified in the dictionary file.
DATA DESIGNER will respond with the number of views
processed, the number of lines read, and the number of
validation errors, if any, that were located:

>VALIDATE STUDENT.SUB STUDENT.DIC
DDVS0013I  Validation begins.
DDVS0024I  2 Views were processed.
DDVS0025I  13 lines were read.
DDVS0015I  0 validation errors were detected.

Once the files are successfully validated, the user will
utilize the subschemas to generate a logical database design
for his or her application.
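
The name-checking part of validation can be pictured with a short Python sketch. It assumes the views have already been reduced to lists of data item names. It covers only the check that each name appears in the dictionary file, not the syntax checks that VALIDATE also performs. The function name validate_views and the data layout are hypothetical.

```python
def validate_views(views, dictionary_items):
    """Return (view name, item name) pairs for items missing from the dictionary."""
    known = set(dictionary_items)
    errors = []
    for view_name, items in views.items():
        for item in items:
            if item not in known:
                errors.append((view_name, item))
    return errors


dictionary_items = ["NAME", "SSN", "SERVICE", "RANK"]
views = {
    "STU-1": ["SSN", "NAME", "SERVICE", "RANK"],
    "STU-ARMY": ["SSN", "NAME", "RANK"],
}
errors = validate_views(views, dictionary_items)
print(len(views), "views processed,", len(errors), "validation errors")
```

A view that named an item absent from the dictionary, such as a hypothetical GRADE, would be reported as one error for that view.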

The ten GENERATE options from which the user can choose
are powerful features that allow the user to control the way
that DATA DESIGNER produces a design and support requests
for varying degrees of information during the generation
process. If the GENERATE command is called without options,
DATA DESIGNER will create a design that removes all
redundant data elements, generates intersection data groups as
necessary to resolve many-to-many relationships, suppresses
repeating data elements within data groups, generates single
key data groups from concatenated keys, and considers all
frequency and timing information that was contained in the
subschema files. In all cases, the end product of the
GENERATE command will be creation of a $DES file, in this
case, STUDENT.DES. The factory user's guide recommends that
options 4, 5, 6, 9, and 10 be used when generating the
initial design or after major revisions to the input files.
A brief description of each generate option is shown in
Table 3. Continuing with our student database example, the
user's dialog will be

>GENERATE OPTION 4 5 6 9 10 TO STUDENT.DESIGN
DDGS0032I  Design generation begins.
DDGS0058I  The subschema file is STUDENT.SUB
DDGS0214I  Option 4 ignores undefined links.
DDGS0281I  Option 5 generates foreign key information.
DDGS0301I  Option 6 generates candidate key information.
DDGS0307I  Option 9 generates cross-reference info.
DDGS0307I  Option 10 ignores frequency and timing info.
DDGS0054I  Design generation has finished.


TABLE 3
DATA DESIGNER Generate Options

Option  Purpose

1       Generate unspecified associations.
2       Suppress resolving redundant data.
3       Suppress creating intersection files.
4       Suppress generating inverse links.
5       Generate foreign key information.
6       Generate candidate key information.
7       Allow repeating data items in groups.
8       Suppress generating single key groups.
9       Generate cross-reference information.
10      Suppress frequency/timing information.
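
As a rough picture of what removing redundant data elements means during generation, the Python sketch below merges views that share a key into a single data group and keeps each item once. This is a deliberately simplified stand-in, not the synthesis algorithm DATA DESIGNER uses: it ignores intersection groups, concatenated keys, and frequency and timing information. The name synthesize and the view layout are assumptions made for illustration.

```python
def synthesize(views):
    """Merge views sharing a key into one data group per key, without duplicates."""
    groups = {}
    for view in views:
        key = tuple(view["keys"])
        members = groups.setdefault(key, [])
        for item in view["items"]:
            if item not in members:  # keep each data item only once
                members.append(item)
    return groups


views = [
    {"name": "STU-1", "keys": ["SSN"], "items": ["NAME", "SERVICE", "RANK"]},
    {"name": "STU-ARMY", "keys": ["SSN"], "items": ["NAME", "RANK"]},
]
design = synthesize(views)
print(design)
```

Both student views are keyed by SSN, so they collapse into one group in which NAME and RANK appear once even though two views request them.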



At this point, the logical database design is completed.
When using the options specified in the example, a series of
reports will be automatically generated. A list of reports
created is contained in Table 4.


TABLE 4
Reports Available with DATA DESIGNER

Report  Type

1       Data Group Links Report
2       Canonical Schema Report
3       Data Group Index Report
4       Multiple Occurrences of Data Items
5       Data Relation Report
6       Data Group Candidate Keys Report
7       Data Item to User View Cross-Reference
8       User View to Data Group Cross-Reference
9       Data Group to User View Cross-Reference


To print these reports, the user's dialog will simply be

>REPORT 123456789 PRINTER FROM STUDENT.DESIGN
DDP20073I  The reports were printed.

As a final aid in evaluation of the logical database
design, DATA DESIGNER is capable of producing diagrams of
(1) an overview of the logical database design and/or (2) a
hierarchical representation of that logical design. To
produce the logical overview diagram, the following dialog
is required:

>PLOT
DDPT0289I  DATA DESIGNER Print Plot Release 2.5A

P>SET TYPE OVERVIEW

P>SET TITLE LOGICAL-DESIGN

P>DRAW FROM STUDENT.DESIGN
DDFS0310I  Design STUDENT.DESIGN's description loaded.
DDNX0271I  The overview plot generation is done.

P>RETURN

P>END

After using the printed reports and diagrams to evaluate the
database design, the user will, if satisfied, transcribe the
design into a specific DBMS format, such as ADABAS, or use
DATA DESIGNER's EDIT capabilities to revise the design as
necessary.

As discussed in Chapter IV, data dictionaries can be
evaluated on the basis of their accomplishment of security,
integrity, and documentation/maintenance. DATA DESIGNER, as
a free-standing data dictionary that can be used in
conjunction with a variety of DBMS and non-DBMS systems, does not
address the security aspect. It was apparently designed
with the assumption that the parent system with which DATA
DESIGNER interacts will handle access control and other
security-related functions.

DATA DESIGNER does, however, receive high marks for
maintaining data integrity and for the quality of its
documentation. Because it is designed to support the
development of logical database designs, it utilizes its dictionary
files to ensure that duplication of data is prevented
through generation of cross-reference files. When a
subschema is modified, DATA DESIGNER again utilizes its
dictionary files in the subsequent design generations. The
PLOT and REPORT functions provide a wealth of information
about the design, its components, and all users of the
subschemas. Relationships, both those included by the user
and those produced by DATA DESIGNER, can be seen in written
reports and visual representations. When modifications and
new designs are produced, the reports are automatically
updated to reflect all changes.

B.   MSP DATAMANAGER

DATAMANAGER, developed by MSP, Inc. of Lexington,
Massachusetts, is one member of the MANAGER family of
dictionary-oriented software products. Other products
include DESIGNMANAGER, PROJECTMANAGER, SOURCEMANAGER, and
TESTMANAGER. The entire line of products, while capable of
batch operations, is designed specifically to support
interactive operations with IBM 360/370/30xx/4300 series (and
plug compatible) computers. While DATAMANAGER is designed
as a nucleus for further expansion or specialization, it
provides all basic capabilities necessary to create and
maintain user dictionaries. Additional capabilities, available
as a series of extra-cost, add-on modules, include:

1.  interfaces to IDMS, ADABAS, IMS, TOTAL, SYSTEM 2000,
and other DBMS

2.  teleprocessing interfaces

3.  generation of COBOL, PL/I, or other source language
data descriptions

4.  generation of DATAMANAGER data definitions from
existing COBOL or PL/I source code

5.  interfacing of a DATAMANAGER dictionary to
user-written programs

6.  status, audit, and security facilities

7.  extensibility through a user-defined syntax facility

DATAMANAGER can provide data dictionary capabilities to
users utilizing a variety of hardware/software combinations.
By providing interface modules for several popular database
management systems, DATAMANAGER is obviously more flexible
than a dictionary that is tied to a single, distinct database
system. However, DATAMANAGER's flexibility extends beyond the
obvious:

DATAMANAGER is intended for use in any organization in
which there is a computerized data processing function.
Its use, however, is not confined to those elements of
data that are held in computer files or that are acted
upon by computerized systems. Definitions of all data
held and used by an organization, in its manual systems
as well as its computerized ones, can be held in a
DATAMANAGER data dictionary. DATAMANAGER is designed to
be used both with traditional files, powerful database
systems, and in a mixed environment. Use of the data
dictionary remains independent of the database
management system, although further add-on facilities enable
DATAMANAGER data definitions to be generated directly
from the database data description language source
coding. [Ref. 21]

The architecture, or structure, of the DATAMANAGER data
dictionary is composed of four (or five) data files, called
data sets in the user documentation.

The source data set contains the data definitions as
originally input into the system by the user. When the user
modifies or appends changes, the data definitions are
automatically updated within the file.

The data entries data set contains all encoded data
definitions generated by DATAMANAGER after evaluating the
contents of the source data set. Data definitions are
encoded to reduce the time required for DATAMANAGER to
process the information within the data dictionary. During
this encoding process, relationships, aliases, and
classifications are also identified.

The index data set is an automated index containing the
name and address of each entity definition that is in the
source data or data entries data sets. The index data set
serves as a data directory to support the fastest possible
retrieval of entity definitions and associated data.

The error recovery data set is used by the system as a
temporary backup storage file. This capability was
implemented to increase reliability by providing for automatic
recovery of the dictionary contents in the case of external
interruption or other system failure during a dictionary
update.

The log data set is an optional capability that is
highly recommended by MSP. All updating commands,
associated data definitions, and amendments are logged into that
file as they occur. Entries include command identification,
full date, time, user, and status of all physical input/
output accesses. Additionally, the data administrator has
the option of specifying that all commands directed to the
data dictionary be logged. When combined with other system
backup facilities, this allows DATAMANAGER to be "rolled"
forward from the last backup point in case full recovery is
ever required.
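
The idea of rolling forward from a backup can be sketched in Python as replaying logged updates, in order, onto a copy of the last backup. The sketch is a conceptual illustration only. The log entry fields (command, name, definition) and the meaning given here to the ADD, ALTER, and REMOVE commands are assumptions, not the format of the DATAMANAGER log data set.

```python
def roll_forward(backup, log_entries):
    """Rebuild dictionary contents by replaying logged updates onto a backup copy."""
    dictionary = dict(backup)  # copy, so the backup itself stays intact
    for entry in log_entries:
        command = entry["command"]
        name = entry["name"]
        if command in ("ADD", "ALTER"):
            dictionary[name] = entry["definition"]
        elif command == "REMOVE":
            dictionary.pop(name, None)
        else:
            raise ValueError("unexpected command in log: " + command)
    return dictionary


backup = {"STUDENT-NAME": "CHAR 50"}
log = [
    {"command": "ADD", "name": "STUDENT-SSN", "definition": "CHAR 11"},
    {"command": "ALTER", "name": "STUDENT-NAME", "definition": "CHAR 60"},
    {"command": "REMOVE", "name": "STUDENT-SSN"},
]
restored = roll_forward(backup, log)
print(restored)
```

Because every update is replayed in the order it was logged, the restored contents match the state the dictionary had reached after the last logged command.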

DATAMANAGER is a powerful system that utilizes a series
of interactive commands to create, maintain, and document
data dictionary contents. These standard commands are
listed in Table 5. DATAMANAGER provides a predefined series
of standard entity-types, relationship-types, and
attribute-types that form the system standard schema. These are
listed in Table 6. As shown in Table 6, DATAMANAGER uses
only six entity-types in the standard schema. Those
elements exist within the system as members of a logical
hierarchy as shown in Figure 5.1. Discussion in the user
documentation reveals that DATAMANAGER strives to provide
the capability to maintain all system data while maintaining
ease and simplicity of logical design.


TABLE 5
DATAMANAGER Standard Commands

ADD            ALSO           ALTER
AUTHORITY      BULK           COPY
DICTIONARY     DOES           DROP
ENCODE         ENDDMR         FORMAT
GLOSSARY       INSERT         KEEP
LIST           MODIFY         PERFORM
PRINT          PROTECT        REMOVE
RENAME         REPLACE        REPORT
SHOW           STATUS         WHAT
WHICH          WHO            WHOSE


TABLE 6
DATAMANAGER Standard Schema Descriptors

PROCESS ENTITY-TYPES

MODULE             PROGRAM            SYSTEM

DATA ENTITY-TYPES

FILE               GROUP              ITEM

RELATIONSHIP-TYPES

SEE

ATTRIBUTE-TYPES

ACCESS-AUTHORITY   ADMINISTRATIVE-DATA
ALIAS              CATALOGUE
COMMENT            DESCRIPTION
EFFECTIVE-DATE     FREQUENCY
NOTE               OBSOLETE-DATE
QUERY              SECURITY-CLASS


A complete specification of the data resource of an
organization requires the definition of the
characteristics and of the interrelationships of data, and of the
contexts in which the data is used. Accordingly, the
design of DATAMANAGER provides for a hierarchy of member
types, within which it is possible to describe all
elements and assemblages of data and the processes that
act on the data. The number of member types defined for
the basic hierarchy has been kept as small as possible
while meeting these requirements. [Ref. 22]


SYSTEM
   |
PROGRAM
   |
MODULE
   |
DATABASE
   |
FILE
   |
GROUP
   |
ITEM


Figure 5.1    DATAMANAGER's Hierarchy of Entity-types

At the lowest level, an ITEM is a fundamental element of
data, the smallest unit within DATAMANAGER. A GROUP is a
collection of items or other groups. The third entity-type,
the FILE, can either be implemented as a traditional file
organization (a collection of data groups, independent of a
DBMS), or as the equivalent association of data within a
database. If DATAMANAGER is used with a database, another
entity-type, DATABASE, will be provided with the database
interface module, e.g., ADABAS, that is selected. The new
member, in this case, ADABAS-DATABASE, will either replace
the FILE element within the hierarchy, or coexist by
residing between the FILE and MODULE elements. A MODULE is
a collection of data that includes descriptions of a
database (if used), FILEs, GROUPs, and/or ITEMs. The module is
the lowest unit that can directly or indirectly manipulate
data, and is a subdivision of a PROGRAM. The PROGRAM is
defined in terms of collections of modules and those
processes that input or output data to/from the system. A
program is executable. A SYSTEM is the highest element of
the DATAMANAGER hierarchy and contains all subordinate data
declarations.
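
The containment rules of this hierarchy can be summarized in a small Python sketch. It is our own simplification, not DATAMANAGER syntax: it covers only the six standard entity-types, treats any higher member as able to contain any lower one, and adds the one stated exception that a GROUP may contain other GROUPs. The names HIERARCHY and may_contain are hypothetical.

```python
# Standard entity-types from highest to lowest (DATABASE omitted for brevity).
HIERARCHY = ["SYSTEM", "PROGRAM", "MODULE", "FILE", "GROUP", "ITEM"]


def may_contain(parent_type, child_type):
    """True if parent_type sits above child_type; GROUPs may also nest."""
    if parent_type == "GROUP" and child_type == "GROUP":
        return True
    return HIERARCHY.index(parent_type) < HIERARCHY.index(child_type)


print(may_contain("FILE", "GROUP"))   # a FILE is a collection of groups
print(may_contain("ITEM", "GROUP"))   # an ITEM is the smallest unit
```

With an interface module installed, the same list could be extended by placing the DATABASE entity-type between MODULE and FILE.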

While DATAMANAGER stresses simplicity in the logical
design of the system standard schema, it can be configured
to be highly extensible. An add-on module, the User Defined
Syntax Facility (UDSF), is required to support user
declaration of schema descriptors. If present, this facility
provides several unique capabilities. First, in addition to
allowing the user to define his or her own entity-types, the
module allows the data administrator to insert one (or more)
of three standard sets of extended entity-types. These sets
are:

1.  The Extended Data Processing Structure (EDPS), which
provides additional entity-types frequently used
within the data processing environment. These
include PROCEDURE, SUBROUTINE, and DATASET.

2.  The Structured Analysis Structure (SAS), which
provides entity-types frequently used when
conducting structured design. These include
SUBPROCESS and DATASTRUCTURE.

3.  The Structured Development Structure (SDS), which
strives to provide all entity-types necessary to
satisfy the requirements of the majority of
potential users. This collection of entity-types includes
all those found in the EDPS and SAS subsets.

Second, the UDSF module supports user definition of
attribute-types related to both system standard and
user-created entity-types. Three distinct categories of
attribute-types are recognized within DATAMANAGER. These
are:

1.  Global (common) attribute-types, which will apply to
all entity-types within the structure, e.g.,
SECURITY-CODE.

2.  Generic attribute-types, which can be added to those
of a specific standard entity-type, for example,
FILE. Whenever a user-defined entity-type is
created that uses the standard entity-type's format
as a base, the generic attribute-types of the
standard entity-type will be passed into the new
entity-type.

3.  Specific attribute-types, which allow the designer to
tailor an entity-type to satisfy the particular
requirements of that organization.

Finally, the UDSF module supports user definition of
relationship-types in both forward and backward directions.
This enables DATAMANAGER to support the three (or four)
relationship mappings we have previously described.
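
The way the three categories of attribute-types combine on a user-defined entity-type can be illustrated with a short Python sketch. Only SECURITY-CODE and FILE come from the text above. The generic attribute BLOCK-SIZE, the specific attribute TAPE-LABEL, and the function attribute_types are hypothetical names chosen for the example, and the sketch does not reflect actual UDSF syntax.

```python
# Global attribute-types apply to every entity-type.
GLOBAL_ATTRIBUTES = ["SECURITY-CODE"]

# Generic attribute-types added to a standard entity-type (BLOCK-SIZE is made up).
GENERIC_ATTRIBUTES = {"FILE": ["BLOCK-SIZE"]}


def attribute_types(base_type, specific=()):
    """Attribute-types of a user-defined entity-type built on base_type."""
    inherited = GENERIC_ATTRIBUTES.get(base_type, [])
    return GLOBAL_ATTRIBUTES + inherited + list(specific)


# A hypothetical user-defined entity-type that uses FILE as its base.
tape_file = attribute_types("FILE", specific=["TAPE-LABEL"])
print(tape_file)
```

The new entity-type receives the global attribute, inherits the generic attribute from its base, and adds its own specific attribute.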

Once DATAMANAGER is installed on the computer, two major
steps must be conducted before information can be entered in
the data dictionary. First, an empty data dictionary must
be defined using the Controller commands (restricted to use by
the data administrator) DICTIONARY and AUTHORITY.
Briefly, the dictionary must be created and opened,
authority levels must be defined, and potential users must
be identified. As the second major implementation step,
member entity-types, both standard and user-created, must
be defined. Every session with DATAMANAGER is conducted as
a "run", in which a series of system commands, specified by
the user, is carried out. Every session must begin with
the commands DICTIONARY and AUTHORITY. After review of the
user documentation, this process will probably seem
difficult and confusing to most users, even to those who have
worked with other data dictionaries. DATAMANAGER is,
however, an impressive, powerful package in the hands of an
experienced user. Our sample database, STUDENT, would be
entered as a FILE (or DATABASE, if implemented). The format
for an individual student's record becomes a GROUP, in which
each data element, e.g., service, SSN, etc., becomes an
ITEM. The structure of our example, after implementation in
DATAMANAGER, would appear as shown in Figure 5.2.

DATAMANAGER aggressively supports each of the three
objectives of data dictionary usage: data integrity,
security, and maintenance/documentation. It enforces data
integrity through its hierarchical structure of
entity-types, predefined standard schema relationships,
identification of aliases, and automatic update procedures. System
definitions and error-checking are used to validate the
structural "correctness" of each entity, relationship, and
attribute as it is created or defined. Once the FILE or
DATABASE is defined, DATAMANAGER monitors input of data into
system structures by comparing the input to the appropriate
ITEM's characteristics. Each of the MSP products, including
the DBMS interfaces, displays evidence that MSP recognizes
the importance of data integrity as a vital link to
efficient and dependable control of data.

00205  PRODUCE STUDENT LAYOUTS,
00206  PRINT GIVING DESCRIPTIONS.

******************************************************
*               DESCRIPTION OF STUDENT               *
******************************************************
* LEVEL  NAME          LEN  TYPE   REMARKS           *
******************************************************
* 1      STUDENT       069  GROUP  STUDENT           *
******************************************************
* 2      STUDENT-NAME  050  CHAR   50 DIG/ALPH-NUM   *
******************************************************
* 2      STUDENT-SSN   011  CHAR   3 NUM,"/",        *
*                                  2 NUM,"/", 4 NUM  *
******************************************************
* 2      STUDENT-SERV  005  CHAR   05 DIG/ALPH-NUM   *
******************************************************
* 2      STUDENT-RANK  003  CHAR   1 CHAR,"-",       *
*                                  1 NUM             *
******************************************************


Figure 5.2    STUDENT example in DATAMANAGER Structure
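
Comparing input against an ITEM's characteristics can be pictured with the Python sketch below, which encodes the lengths and formats listed in Figure 5.2 as simple rules. It is an assumption-laden illustration, not DATAMANAGER's checking logic: the rule table ITEM_RULES, the function check_item, and the reading of "1 CHAR" in the rank format as a single upper-case letter are all our own.

```python
import re

# Maximum length and optional format per ITEM, taken from Figure 5.2.
ITEM_RULES = {
    "STUDENT-NAME": {"length": 50, "pattern": None},
    "STUDENT-SSN": {"length": 11, "pattern": r"\d{3}/\d{2}/\d{4}"},
    "STUDENT-SERV": {"length": 5, "pattern": None},
    "STUDENT-RANK": {"length": 3, "pattern": r"[A-Z]-\d"},  # letter assumed
}


def check_item(name, value):
    """Return True if value fits the length and format recorded for the ITEM."""
    rule = ITEM_RULES[name]
    if len(value) > rule["length"]:
        return False
    if rule["pattern"] is not None:
        return re.fullmatch(rule["pattern"], value) is not None
    return True


print(check_item("STUDENT-SSN", "123/45/6789"))
print(check_item("STUDENT-RANK", "O-3"))
```

An SSN entered without separators, or a name longer than 50 characters, would be rejected by these rules.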

The DATAMANAGER nucleus provides security by inclusion
of one type of security mechanism, password control. The
Controller, or dictionary administrator, must assign a
unique password to each authorized user. Each user and
password combination must be registered within the
dictionary. DATAMANAGER will reject any command session
which does not commence with an AUTHORITY command followed
by an authorized password.

Several additional security mechanisms can be provided by
including the Audit and Security Facility module in the
system implementation. First, the Controller gains the
capability of registering general and specific security
levels within the dictionary. Each user may be assigned a
general security level in addition to the unique password
previously assigned. Within the system, the Controller will
assign a specific Insertion Security Level and a specific
Protection Security Level. A user whose general level is
lower than the specific insertion level is not allowed to
insert, modify, or delete information within the data
dictionary. This provides the capability to assign "read
only" access. A user whose general level is lower than the
specific protection level, or one who does not have a general
security level assigned, is not allowed to establish
protection for system members, or data structures.
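The level comparisons described above can be sketched in a few
lines of Python. This is an illustrative model only, not
DATAMANAGER code: the class names, method names, and numeric
levels are hypothetical, and treating a user with no general
level as level 0 for insertion is an assumption, since the text
states the unassigned case only for protection.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    general_level: Optional[int] = None  # None means no general level assigned


@dataclass
class DictionarySecurity:
    insertion_level: int   # specific Insertion Security Level
    protection_level: int  # specific Protection Security Level

    def may_update(self, user: User) -> bool:
        """True if the user may insert, modify, or delete dictionary members."""
        # Assumption: a user with no general level is treated as level 0 here.
        level = user.general_level if user.general_level is not None else 0
        return level >= self.insertion_level

    def may_protect(self, user: User) -> bool:
        """True if the user may establish protection for members."""
        if user.general_level is None:
            return False  # stated in the text: unassigned users cannot protect
        return user.general_level >= self.protection_level


security = DictionarySecurity(insertion_level=5, protection_level=7)
reader = User("reader", general_level=3)  # below insertion level: read only
owner = User("owner", general_level=7)    # may update and may protect
```

Under these assumed levels, `reader` has "read only" access,
while `owner` may both update the dictionary and establish
protection.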

Second, users who do have a general security level equal to
or higher than the specific protection level may use the
PROTECT command to assign protection to specific members in
the form of ACCESS, ALTER, and REMOVE security levels. This
capability allows key users to control, or even prohibit,
access to those structures that they own. Any member which is
not owned but does require security can be assigned the same
three control levels by the dictionary administrator.

Finally, the Audit module provides the capability to produce
over 500 different audit reports, using information contained
within DATAMANAGER. The majority of these reports are
reserved for the use of the dictionary administrator alone.
This includes the capability of logging all commands issued
to the system. This "trace" mechanism increases security by
providing a record of all entries, or attempted entries, to
the system.

The last significant objective of a data dictionary must be
to support maintenance and documentation of the information
contained within the information system. DATAMANAGER provides
a set of commands unique to the maintenance function. A
listing of these is shown as Table 7. Maintenance can be
supported during both interactive and batch sessions. A
series of query and report commands is provided with the
nucleus module to support usage studies, maintenance, and
documentation. These commands are listed in Table 8. The
REPORT, PRINT, and GLOSSARY commands provide a great deal of
information to the dictionary administrator and other
designated users. When system data is modified, the query and
report commands can be used to provide updated documentation
and records.

TABLE 7
DATAMANAGER Maintenance Commands

INSERT         MODIFY         ENCODE
REPLACE        BULK ENCODE    COPY
ADD            RENAME         ALTER
REMOVE         KEEP           DROP
ALSO KEEP      PERFORM

TABLE 8
DATAMANAGER Report/Query Commands

Report Commands      Query Commands

PRINT                LIST
SWITCH               SKIP
GLOSSARY             BULK PRINT
TEXT                 WHAT
BULK REPORT          WHOSE
REPORT               WHO
SPACE                DOES
                     WHICH
                     SHOW

One additional DATAMANAGER capability warrants mention with
respect to maintenance and documentation. One system
entity-type which has not been discussed and does not reside
in the hierarchy shown earlier is the COMMAND-STREAM
entity-type. This structure is a unique feature of
DATAMANAGER that allows a previously stored series of
commands to be executed by using the PERFORM command. The use
of specific COMMAND-STREAMs can be compared to the
subroutines of a general programming language. While the
COMMAND-STREAM can be used in many ways within DATAMANAGER,
it becomes especially useful during generation of reports and
documentation during maintenance sessions. A "subroutine" can
be specified that will produce all standard reports; when
system information is updated, the applicable reports are
produced by one simple PERFORM command at the end of the
maintenance session.
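The subroutine analogy can be made concrete with a small
sketch. The Python class below is a toy interpreter, not
DATAMANAGER itself: the command spellings (for example
"PERFORM STANDARD-REPORTS") and the stream name are assumed
for illustration, and a command is "executed" simply by
recording it in a log.

```python
from typing import Dict, List


class MiniDictionary:
    """Toy model of stored command streams replayed by PERFORM."""

    def __init__(self) -> None:
        self.streams: Dict[str, List[str]] = {}  # stream name -> stored commands
        self.log: List[str] = []                 # commands actually executed

    def store_stream(self, name: str, commands: List[str]) -> None:
        """Store a named series of commands for later reuse."""
        self.streams[name] = list(commands)

    def execute(self, command: str) -> None:
        """Run one command; PERFORM expands a stored stream in place."""
        words = command.split()
        if len(words) == 2 and words[0] == "PERFORM":
            for stored in self.streams[words[1]]:
                self.execute(stored)
        else:
            self.log.append(command)


dd = MiniDictionary()
dd.store_stream("STANDARD-REPORTS",
                ["REPORT STUDENT", "GLOSSARY", "PRINT STUDENT"])
dd.execute("MODIFY STUDENT-RANK")       # a maintenance action
dd.execute("PERFORM STANDARD-REPORTS")  # one command regenerates all reports
```

After the session, `dd.log` holds the maintenance command
followed by the three stored report commands, which is the
behavior the "subroutine" comparison describes.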

C.   ADR DATADICTIONARY

DATADICTIONARY is one of fourteen separate, but highly
integrated, software products produced by Applied Data
Research, Inc. (ADR). Initially introduced in 1978, the
integrated system, Relational Information Management
Environment (RIME), is considered to be one of the first true
examples of the fourth generation of systems software. An
article in Infosystems states:


Three conditions are certain in the 1980s. ... First,
applications packages will not meet the need for most
applications that will be computerized. Second, systems
software products that improved productivity, reduced
application costs and increased accessibility to information
in the 1970s will be even more valuable in the 1980s. And
third, existing applications will not be readily rewritten or
replaced and will have to be maintained for many years....
The success or failure of many organizations in the 1980s
will depend on how effectively they improve and integrate
data processing in their operations. This is particularly
critical for organizations that have been traditional data
processing users over the last 20 years and have worked with
second and third generation mainframe hardware and software
systems. [Ref. 23]

Prior  to  analyzing  ADR's  data  dictionary,  it  is  important  to 
review  briefly  the  objectives  of  fourth  generation  software 
and  integrated  systems  and  to  provide  an  overview  of  RIME. 




Each of the "generations" of system software can be
identified by one or more significant advancements. The first
generation provided primarily assembly language programs. The
second generation's gifts centered around development of
high-level languages and improved operating systems. Numerous
advances surfaced during the third generation, e.g., database
management systems, data dictionaries, structured programming
techniques, early efforts at decision support systems, and
program generators. During the fourth generation, it is
anticipated that advances will occur in three primary areas:
very high-level languages, relational database management
systems, and the automated office or integrated information
center. In the latter, all automated functions, including
data processing, word processing, database and file
management, decision support, program development and
maintenance, and communications, will be combined into one
"total" system. This could, in theory, be accomplished by one
giant program, or, in the case of ADR and other vendors, as a
series of smaller, integrated packages.

During 1982, the U.S. Army awarded a contract for the
largest, most complex information processing project ever
funded by the government. Named VIABLE (Vertical Installation
Automation Baseline), the project will provide a nationwide
automated network that will connect forty-seven military
bases to massive computer power at five regional data
processing centers. The network has been designed to support
the management of information in peacetime and in times of
war and other national emergencies. During the planning
period, interest centered on three principal functional
areas: communication, interactive program development, and
database management. The primary contractor, Electronic Data
Systems, selected 11 of ADR's products for use as the base of
the VIABLE system. A complete list of ADR/RIME elements is
included as Table 9.

TABLE 9
Components of ADR's DMCDS System

Component         Function

DATACOM/DB        Relational Database System
DATADICTIONARY    Resource Control System
DATAQUERY         English-like Query Language
DATAREPORTER      Info. Retrieval/Reporting
DATAENTRY         On-line Data Entry System
COBOL/DL          Extended Language/Utilities
LIBRARIAN         Program Management System
ROSCOE            Program Maintenance System
LOOK              Real-time Measurement System
METACOBOL         Language Pre-compiler
AUTOFLOW II       System Development Tool
ADR/D-NET         Distributed Database Network
ADR/EMAIL         Electronic Mail System
ADR/IDEAL         Interactive Develop. System

*  New ADR products which have not yet been
   included in the Army's VIABLE project.


Some of these elements can be considered to be high-priced
extras or application-specialized options. If an organization
were to utilize all components, users would have access to a
complete database system with data dictionary, a relational
query language, report and graph generators, extended COBOL
compiler, program development support, distributed local data
network, electronic mail system, and more.

According to ADR literature, the heart of the integrated
system is DATADICTIONARY. The company's database system,
DATACOM/DB, a true relational database system [footnote 3]
that utilizes a patented flexible data structure, was
designed especially to interact with DATADICTIONARY. As an
active dictionary system, DATADICTIONARY is queried by all
other components of the system prior to access of system
information. This maximizes data integrity while minimizing
data redundancy. DATADICTIONARY offers a menu-driven user
interface. It provides security, supplies full
documentation/maintenance capabilities, and can be extended
to interact with future system products and to support future
user requirements. The documentation provided with
DATADICTIONARY and other ADR packages is almost overwhelming
in its completeness. The dictionary alone has fifteen
separate volumes. While an extremely capable system,
DATADICTIONARY is not one that will be easily or quickly
mastered.

3. A relational database is one in which the relationships
between data are implied by the values of the data. For
example, two records are related if they have the same
attribute, as STUDENT and PROFESSOR are related by the fact
that they are associated with a particular CLASS.

DATADICTIONARY provides 20 standard entity-types in its
system standard schema and supports user creation of
additional, more application-specific schema descriptors. For
most applications, the standard types listed in Table 10 will
prove to be sufficient.

TABLE 10
ADR DATADICTIONARY Standard Entity-types

DATABASE      KEY          SYSTEM       REPORT
AREA          ELEMENT      PROGRAM      JOB
FILE          LIBRARY      MODULE       STEP
RECORD        MEMBER       DATAVIEW     AUTHORIZATION
FIELD         PANEL        PERSON       NODE

DATADICTIONARY maintains a logical hierarchy among the
principal standard entity-types, as indicated in Figure 5.3.

Figure 5.3    A Logical Hierarchy of Entity-types

Many of the standard entity-types are provided with primary
relationships already defined with key subordinate
entity-types. For example, in our STUDENT example, we will
initially use the entity-type DATABASE to create our sample
database. When we define the database entity-occurrence, the
DATABASE-AREA relationship is automatically provided.
Similarly, when the area-occurrence is defined, the AREA-FILE
relationship is established by DATADICTIONARY. In the case of
RECORD, creation of an occurrence provides three
relationship-types: RECORD-FIELD, RECORD-KEY, and
RECORD-ELEMENT. These three relationships, at the lowest
level of the logical hierarchy, support actual entry of
attribute-values, or data. Whether system-defined or
user-created, all relationship-types in
DATADICTIONARY have four attributes:

1.  Relationship-mapping -- describes the number of
    entity-occurrences which are the subjects and the
    objects of this relationship, i.e., the type of the
    relationship. DATADICTIONARY supports four types of
    relationship mappings, i.e., one-to-one, one-to-many,
    many-to-one, and many-to-many.

2.  Required-relationship -- describes whether each
    entity-occurrence in the named object entity-type is
    to be related to at least one entity-occurrence of
    the named subject entity-type.

3.  Automatic-relationship -- describes whether each
    entity-occurrence of the named object entity-type is
    to be automatically related to an entity-occurrence
    of the named subject entity-type when the object is
    added.

4.  Ordered-relationship -- describes whether the order of
    relationships added in this relationship-type is
    significant. An ordered-relationship allows
    entity-occurrences to be retrieved and displayed in a
    specific order.
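The four attributes can be pictured as fields of a small
record. The Python sketch below is an illustration under
stated assumptions, not ADR's internal representation: the
class and method names are hypothetical, only the mapping code
"1M" appears in the panels of this section (the other three
codes are invented for symmetry), and the reading of the
mapping as "how many objects one subject may hold" is an
interpretation of the text.

```python
from dataclasses import dataclass
from enum import Enum


class Mapping(Enum):
    ONE_TO_ONE = "11"    # assumed code
    ONE_TO_MANY = "1M"   # code shown in the relationship display panel
    MANY_TO_ONE = "M1"   # assumed code
    MANY_TO_MANY = "MM"  # assumed code


@dataclass(frozen=True)
class RelationshipType:
    name: str
    subject_type: str
    object_type: str
    mapping: Mapping   # 1. relationship-mapping
    required: bool     # 2. required-relationship
    automatic: bool    # 3. automatic-relationship
    ordered: bool      # 4. ordered-relationship

    def allows_another_object(self, current_object_count: int) -> bool:
        """May one subject occurrence be related to one more object?"""
        if self.mapping in (Mapping.ONE_TO_ONE, Mapping.MANY_TO_ONE):
            # Each subject may hold at most one object in these mappings.
            return current_object_count == 0
        return True


database_area = RelationshipType(
    name="$INTERNAL", subject_type="DATABASE", object_type="AREA",
    mapping=Mapping.ONE_TO_MANY, required=True, automatic=False,
    ordered=False,
)
```

With a one-to-many mapping, a DATABASE occurrence may keep
acquiring AREA occurrences; a one-to-one mapping would refuse
a second one.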

If using the interactive version, DATADICTIONARY Online, the
user will be prompted by a series of panels, or menus. The
Master Menu is displayed in Figure 5.4. The master menu
supports creation, modification, and deletion of
entity-occurrences. Additionally, it provides access to all
other system menus through option (7). The following
procedures would be utilized to create the STUDENT example
within DATADICTIONARY. First, the Add Detail routine, option
(2), is selected. In answering the system prompts, the user
creates the new entity-occurrence, DATABASE.STUDENT, in the
following dialog:

=>

DDOL: SELECTION CRITERIA FOR DETAIL ADD

LV  TY  ENTITY    RECORD  DD OCCURRENCE NAME  VER  STAT
00  E   DATABASE          STUDENT             001

CURRENT OCCURRENCE      QUALIFIER:
DATABASE STUDENT (001) TEST

******************************************************

DETAIL ADD
ATTRIBUTE       VALUE

DESCRIPTION     NPS STUDENT DATABASE
CONTROLLER      DEPARTMENT OF REGISTRAR
AUTHOR          REGISTRAR
BASE-ID         001
BASE-TYPE       ADR/DB
DBMS-USED       RELATIONAL


=>

MASTER MENU

ENTER THE REQUESTED OPTION ==>     THERE ARE 08 OPTIONS

1.  DISPLAY MENU    MENU FOR DISPLAY FUNCTIONS
2.  ADD DETAIL      ADD DETAIL ENTITY-OCCURRENCE
3.  DELETE DETAIL   DELETE DETAIL ENTITY-OCCURRENCE
4.  UPDATE DETAIL   UPDATE DETAIL ENTITY-OCCURRENCE
5.  COPY            COPY/MODEL ENTITY-OCCURRENCE
6.  STATUS CHANGE   CHANGE ENTITY-OCCURRENCE STATUS
7.  SUPPORT MENU    ALIAS, DESCRIPTOR, RELATIONSHIP,
                    TEXT AND QLIST
8.  SECURITY        OCCURRENCE SECURITY MAINTENANCE


Figure 5.4    ADR DATADICTIONARY Master Menu

Each of the 20 standard entity-types will contain predefined
key attributes. Values for these attribute-types are entered
during the Add Detail routine. In the case of the DATABASE
entity-type, and as was shown above, the key attributes are
DESCRIPTION, CONTROLLER, AUTHOR, BASE-ID, BASE-TYPE, and
DBMS-USED.

In similar fashion, the user must create the subordinate
logical structures, AREA.STUDENT, FILE.STUDENT, and
RECORD.STUDENT. As each occurrence is created, it must be
related to the next highest entity-occurrence in the logical
hierarchy, e.g., FILE.STUDENT must be related to
AREA.STUDENT. For this process, the user invokes the
Relationship Definition Panel to define the relationships.
DATADICTIONARY will respond with the Relationship Definition
Display, which presents the characteristics of each of the
relationships as it is enacted. Examples of these panels are
shown below:

=> 

DDOL:   RELATIONSHIP DEFINITION
RELATIONSHIP NAME        $INTERNAL
SUBJECT ENTITY TYPE      DATABASE.STUDENT
OBJECT ENTITY TYPE       AREA.STUDENT

=>

RELATIONSHIP DEFINITION DISPLAY

SELECTION:

$INTERNAL  DATABASE.STUDENT  AREA.STUDENT

NAME       SUBJ TYPE  OBJ TYPE  MAP  REQ  AUTO  ORDER

$INTERNAL  DATABASE   AREA      1M   Y    N     N

As a final step in installing the STUDENT database, QLIST
commands must be used to define specific fields, keys, and
elements within RECORD.STUDENT. This is the point where the
specific attributes of the STUDENT example, e.g., SSN, Name,
Service, and Rank, are entered into the database design. The
user defines attribute name, parent, class, type, length, and
number of repetitions. One example of this process is as
follows:

=> 


DDOL: SELECTION CRITERIA FOR RECORD QLIST MAINT

LV  TY  ENTITY    RECORD  DD OCCURRENCE NAME  VER  STAT
00  E   RECORD            STUDENT                  TEST

CURRENT OCCURRENCE      QUALIFIER:
RECORD STUDENT (001) TEST

******************************************************

RECORD QLIST MAINTENANCE

E  FC  FIELD NAME  PARENT NAME  INSERT AFT  C  T  LEN  REP
   A   SERVICE     SSN          NAME        S  C  004  001

Looking at the last line of the figure, the user has
indicated the following:

FC (function code) = Add a field
Field Name = SERVICE
Parent Name = SSN (in this case, this is the key field)
Insert After = NAME (SERVICE's value will follow NAME's)
C (Class) = Simple (as opposed to a compound field)
T (Type) = Character (vice a numeric or binary field)
LEN (Length of Field) = 4
REP (Number of Repetitions) = 001 (vice a repeating field)

At this point, the schema of STUDENT has been entered into
DATADICTIONARY. The user may now use DATACOM/DB facilities to
enter attribute-values into the system. Upon completion, the
database administrator or authorized users can create as many
external views, or subschemas, as desired.

DATADICTIONARY receives high marks in the areas of data
integrity, security, and documentation/maintenance.
DATADICTIONARY's logical hierarchy of structures and
systematic installation procedures tend to enforce data
integrity. The dictionary's extension routines and view
generation processes have been written to ensure that data
integrity is maintained throughout expansion or
specialization of the database. To enforce security,
DATADICTIONARY provides multiple layers of protection. Two
separate and independent mechanisms are provided in all
implementations. These are (1) use of entity passwords, and
(2) inclusion of locks and override codes. If the
installation is the Online version, a third mechanism, user
validation, is available. As each entity is created, or at
any time afterwards, a four-digit password can be assigned to
that entity. Passwords can be either unique or assigned to a
series of related entities. Any user attempting to modify or
access a password-protected entity-occurrence will be queried
to provide the applicable password prior to gaining access.
The second layer of protection centers on use of LOCK and
OVERRIDE codes. Unlike passwords, which either allow or
prohibit access, lock codes can be utilized to limit the
degree of access granted. Three levels of security are
provided:



LOCK0    No restrictions exist on an entity (default
         setting).

LOCK1    The entity cannot be updated or deleted without an
         override code. The entity can be copied, displayed,
         or printed without restrictions.

LOCK2    No action will be permitted unless the override
         code is given to the system.

The actual override codes will be used dictionary-wide, that
is, a single code will exist to satisfy LOCK1 conditions
while another code exists to access entities protected by
LOCK2. Finally, if using DATADICTIONARY Online, the highest
layer of security becomes user validation. The name of each
user of the system is defined as an occurrence of the PERSON
entity-type. Each entity-occurrence will include a unique
password which must be provided to enter the system through
the Online interface. Four levels of authorization are
supported by DATADICTIONARY:

_DIS     The  user  is  allowed  to  display  all  data  in 
the  dictionary. 

_UPD     The  user  is  allowed  to  update  the  dictionary. 

_COP     The  user  is  allowed  to  copy  an  entity. 

_ADM     The  user  is  allowed  the  use  of  all  commands 
and  is  allowed  to  process  all  panels. 

Authorization at one level will automatically provide all
lower authorizations.

ADR's multiple-layered approach to security provides a system
that is both highly flexible and very secure. The database
administrator will be able to provide whatever degree of
access is required to each individual user as well as to each
group of users within the system. If one layer of security is
broken, access will be prevented by the other security
mechanisms.
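The way the three layers combine can be sketched as a single
permission check. The Python below is a simplified model, not
ADR code. Its assumptions are: the four authorization levels
rise in the order in which they are listed above; printing
needs only display authorization; and the override codes
"OVR1" and "OVR2", the function name, and the action names are
all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ascending order, taken from the order of the list in the text.
AUTH_ORDER = ["DIS", "UPD", "COP", "ADM"]
# Assumed minimum authorization for each action.
REQUIRED_AUTH = {"display": "DIS", "print": "DIS",
                 "update": "UPD", "delete": "UPD", "copy": "COP"}
# Hypothetical dictionary-wide override codes, one per lock level.
OVERRIDE_CODES = {1: "OVR1", 2: "OVR2"}


@dataclass
class Entity:
    name: str
    password: Optional[str] = None  # four-digit entity password, if any
    lock: int = 0                   # 0 = LOCK0, 1 = LOCK1, 2 = LOCK2


def permitted(user_auth: str, action: str, entity: Entity,
              password: Optional[str] = None,
              override: Optional[str] = None) -> bool:
    """Apply user validation, then entity password, then lock level."""
    # Layer 1: the user's authorization must reach the action's level.
    if AUTH_ORDER.index(user_auth) < AUTH_ORDER.index(REQUIRED_AUTH[action]):
        return False
    # Layer 2: a password-protected entity requires its password.
    if entity.password is not None and password != entity.password:
        return False
    # Layer 3: LOCK1 guards update/delete; LOCK2 guards every action.
    if entity.lock == 2 or (entity.lock == 1 and action in ("update", "delete")):
        return override == OVERRIDE_CODES[entity.lock]
    return True


record = Entity("RECORD.STUDENT", password="1234", lock=1)
```

In this model a user with update authorization still cannot
change `record` without both the entity password and the
LOCK1 override code, while display remains possible with the
password alone.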


Invocation of any function thus authorized on any entity is
still subject to the password and lock provision discussed
earlier in this section. Thus, a user with BDD_UPD
authorization cannot modify an entity that is password
protected unless the required password is supplied.
[Ref. 24]

DATADICTIONARY provides extensive capabilities to support
maintenance and documentation of the data dictionary. It can
be maintained by using either the Online maintenance facility
or available batch commands. If using the online facility, a
series of screen panels will again guide the user through the
desired maintenance activity. This facility will greatly
simplify individual changes; however, major changes affecting
numerous entities would be initiated most easily through
batch commands. In either case, maintenance centers around
four principal functions:

1.  adding, copying, updating, or deleting system entities

2.  search for, identification of, and creation of entity
    aliases

3.  maintenance of descriptors and schema descriptors

4.  maintenance of descriptive texts associated with system
    entities

Similarly, DATADICTIONARY provides numerous report generation
capabilities, most of which can be initiated through either
batch or Online Maintenance sessions. Principal report types
are shown in Table 11. Generated reports will support both
the initial generation of user databases and subsequent
maintenance of system data and the structures utilized to
display it.


TABLE 11
Principal Reports of DATADICTIONARY

INDEX          INDENTED          DETAIL
FIELD          TEXT              ALIAS
DESCRIPTOR     RELATIONSHIPS     DEFINITIONS


D.   ORACLE

ORACLE is a relational database management system developed
by Relational Software Incorporated of Menlo Park,
California. It was originally developed for use with Digital
Equipment Corporation PDP minicomputers and has been
converted to operate on IBM mainframes as well [Ref. 25].
Included in ORACLE is a dependent data dictionary that
performs a limited number of the functions discussed in
previous chapters.

Data is stored in ORACLE as relations, or two-dimensional
tables, which are organized into rows and columns. SQL
(Structured Query Language) is used for query, manipulation,
definition, and control of the ORACLE database. Information
about the contents of a table, its creator, authorized users,
calling programs, and associated views is kept in the data
dictionary and can be retrieved via SQL commands.

ORACLE's logical hierarchy of structures, as shown in Figure
5.5, demonstrates the comparative simplicity of this system.
In this figure, a single arrowhead represents a one-to-one
relationship while the double arrowheads signify one-to-many
relationships. The database is divided into logical
partitions which can only be created or altered by the
database administrator. When users define tables, the system
allocates memory for one indexspace and one dataspace. The
indexspace is used by the database/dictionary to store
information about the table while the dataspace is utilized
for storing the actual information. As data is entered into
the database, the system automatically appends extents (and
pages) as necessary to support specific tables.

ORACLE DATABASE
   PARTITION(S)
      TABLE(S)
         INDEXSPACE          DATASPACE
            EXTENT(S)           EXTENT(S)
               PAGE(S)             PAGE(S)

Figure 5.5    ORACLE's Logical Hierarchy

ORACLE's 18 data dictionary tables are described in Figure
5.6. An example of one of the tables, CATALOG, appears in
Figure 5.7. Tables with the "SYS" prefix include information
on system data in addition to the user's data. For example, a
display of SYSCATALOG might appear as Figure 5.8. In this
particular example, there are 20 entries, 18 of which are
system tables or views.

ORACLE's data dictionary is automatically updated whenever
any additions or deletions are made to the database or when
views are defined or user privileges are changed, so it
always has a current description of the database. As an
example, assume a new view, NAVYVIEW, is created using the
SQL CREATE command:

UFI> CREATE VIEW NAVYVIEW AS
  2  SELECT NAME,SSN,RANK
  3  FROM STUDENTS
  4  WHERE SERVICE = 'USN'

View created.

DTAB          - Description of tables & views in Oracle data
                dictionary
SYSCATALOG    - Profile of tables & views accessible to user
CATALOG       - Profile of tables accessible to user,
                excluding data dictionary
TAB           - List of tables, views, clusters, and synonyms
                created by user
SYSCOLUMNS    - Specifications of columns in accessible tables
                and views
COLUMNS       - Specifications of columns in tables (excluding
                data dictionary)
COL           - Specifications of columns in tables created
                by the user
SYSINDEXES    - List of indexes, underlying columns, creator,
                and options
INDEXES       - Indexes created by user & indexes on tables
                created by user
SPACES        - Selection of space definitions for creating
                tables & clusters
VIEWS         - Quotations of the SQL statements upon which
                views are based
SYSTABAUTH    - Directory of access authorization granted by
                or to the user
EXTENTS       - Data structure of extents within tables
STORAGE       - Data and index storage allocations for user's
                own tables
SYSSTORAGE    - Summary of all database storage -- for DBA
                use only
SYSUSERAUTH   - Master list of Oracle users -- for DBA use
                only
SYSEXTENTS    - Data structure of tables throughout system
                -- for DBA use only
PARTITIONS    - File structure of files within partitions
                -- for DBA use only

Figure 5.6    Tables of the ORACLE Data Dictionary

NAME        CREATOR   TABTYPE   TABID

STUDENTS    LANDIN    TABLE     228609
ARMYVIEW    OWENS     VIEW      268800

Figure 5.7    ORACLE CATALOG Listing


NAME         CREATOR   TABTYPE   TABID

HELP         SYSTEM    TABLE     9985
DUAL         SYSTEM    TABLE     10497
STORAGE      SYSTEM    VIEW      11520
EXTENTS      SYSTEM    VIEW      11776
SPACES       SYSTEM    VIEW      12288
SYSCOLUMNS   SYSTEM    VIEW      12544
COLUMNS      SYSTEM    VIEW      12800
SYSCATALOG   SYSTEM    VIEW      13056
CATALOG      SYSTEM    VIEW      13312
SYSINDEXES   SYSTEM    VIEW      13568
INDEXES      SYSTEM    VIEW      13824
VIEWS        SYSTEM    VIEW      14080
SYSTABAUTH   SYSTEM    VIEW      14336
TAB          SYSTEM    VIEW      14843
COL          SYSTEM    VIEW      15104
EXPTAB       SYSTEM    VIEW      15360
EXPVEW       SYSTEM    VIEW      15616
DTAB         SYSTEM    TABLE     15873
STUDENTS     LANDIN    TABLE     228609
ARMYVIEW     OWENS     VIEW      268800

Figure 5.8    ORACLE SYSCATALOG Listing

Upon completion of this dialog, all ORACLE data dictionary
files will have been automatically updated to include the new
view. The CATALOG table would now appear as shown in
Figure 5.9.
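The same self-updating behavior can be observed in any
relational system that keeps its catalog in queryable tables.
The sketch below uses SQLite through Python's standard
`sqlite3` module as a stand-in; it is not ORACLE, and SQLite's
`sqlite_master` table only loosely plays the role of CATALOG.
The column layout of STUDENTS and the ARMYVIEW selection
criterion ('USA') are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENTS (NAME TEXT, SSN TEXT, SERVICE TEXT, RANK TEXT)")
# Hypothetical definition of the existing view.
conn.execute(
    "CREATE VIEW ARMYVIEW AS "
    "SELECT NAME, SSN, RANK FROM STUDENTS WHERE SERVICE = 'USA'")


def catalog(connection):
    """List (name, type) pairs from SQLite's built-in schema table."""
    rows = connection.execute(
        "SELECT name, type FROM sqlite_master ORDER BY rowid")
    return [(name, kind.upper()) for name, kind in rows]


before = catalog(conn)

# Defining the view is the only step taken; no catalog row is written by hand.
conn.execute(
    "CREATE VIEW NAVYVIEW AS "
    "SELECT NAME, SSN, RANK FROM STUDENTS WHERE SERVICE = 'USN'")

after = catalog(conn)
```

Comparing `before` and `after` shows one additional entry,
NAVYVIEW of type VIEW, added by the database system itself as
a side effect of the CREATE VIEW statement.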

NAME        CREATOR   TABTYPE   TABID

STUDENTS    LANDIN    TABLE     228609
ARMYVIEW    OWENS     VIEW      268800
NAVYVIEW    LANDIN    VIEW      288240

Figure 5.9    ORACLE CATALOG Listing With New View

ORACLE provides security by using its data dictionary to
control access within the database. The database
administrator (DBA) provides the first level of access by
entering the user's name into the data dictionary's
SYSUSERAUTH table. Initial privileges, or subsequent changes
to authorized privileges, are issued using the GRANT or
REVOKE commands. ORACLE also supports multi-layered access:
in addition to privileges authorized by the DBA, a user can
grant various degrees of access privilege to others for
tables or views which he or she has created. A list of
current authorizations is maintained in the dictionary's
SYSTABAUTH view, as shown in Figure 5.10.
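The grant-and-revoke scheme can be modeled as a small
authorization table. The Python class below is a toy
illustration only and does not reproduce ORACLE's GRANT syntax
or the layout of SYSTABAUTH: the class, its methods, the user
name "DBA", and the privilege names are assumptions. The
creator names LANDIN and OWENS are taken from the catalog
listings.

```python
from typing import Dict, Set, Tuple


class AuthCatalog:
    """Toy stand-in for an authorization table such as SYSTABAUTH."""

    def __init__(self, dba: str) -> None:
        self.dba = dba
        self.creators: Dict[str, str] = {}               # table -> creator
        self.grants: Dict[Tuple[str, str], Set[str]] = {}  # (table, user) -> privileges

    def create(self, creator: str, table: str) -> None:
        """Record who created a table or view."""
        self.creators[table] = creator

    def _may_administer(self, user: str, table: str) -> bool:
        # Only the DBA or the creator may grant or revoke on a table.
        return user == self.dba or self.creators.get(table) == user

    def grant(self, grantor: str, table: str, grantee: str, *privileges: str) -> None:
        if not self._may_administer(grantor, table):
            raise PermissionError(f"{grantor} may not grant on {table}")
        self.grants.setdefault((table, grantee), set()).update(privileges)

    def revoke(self, grantor: str, table: str, grantee: str, *privileges: str) -> None:
        if not self._may_administer(grantor, table):
            raise PermissionError(f"{grantor} may not revoke on {table}")
        self.grants.get((table, grantee), set()).difference_update(privileges)

    def allowed(self, user: str, table: str, privilege: str) -> bool:
        if self._may_administer(user, table):
            return True
        return privilege in self.grants.get((table, user), set())


auth = AuthCatalog(dba="DBA")
auth.create("LANDIN", "STUDENTS")
auth.grant("LANDIN", "STUDENTS", "OWENS", "SELECT")  # creator grants read access
```

Here OWENS may query STUDENTS but not update it, and may not
pass privileges on, because OWENS is neither the DBA nor the
creator of the table.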

ORACLE is a strong performer in the data integrity category.
Since the data dictionary is an integral part of the database
system, data is only maintained at one location within the
database. This prevents two users from acquiring data from
the database and getting different results. If data were
duplicated within the system, it would be possible for one
location to be updated while the other was not. Figures 5.7
through 5.10 show that the ORACLE user will deal mostly with
subsets of the database, or subschemas.

Figure 5.13    ORACLE SYSTABAUTH Listing for User Owens

ORACLE's documentation is limited to the information
that can be found in the data dictionary tables. It does
not provide information about which users use which data,
how often data is used, or when it is used. ORACLE does
support maintainability through automatic update of its
tables and through the concept of data independence. This
concept implies a separation of data definitions from the
programs or queries that might access the data in the
database. This allows the structures or definitions of the
data constructs to be modified without necessitating changes
in the programs or queries that access the database. If a
table is extensively modified, a view can be created to
interface with current programs. ORACLE's data integrity
will maintain the currency of the view by automatically
updating the view whenever applicable portions of the
governing table are modified.
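
The role of a view in preserving data independence can be
illustrated with a small sketch. It uses Python's built-in sqlite3
module as a stand-in for ORACLE, so the catalog table consulted
here (sqlite_master) is SQLite's, not ORACLE's; the table names,
column names, and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The restructured base table: its column names no longer match
# what existing queries were written against.
conn.execute("CREATE TABLE students_v2 (full_name TEXT, branch TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO students_v2 VALUES (?, ?, ?)",
    [("Adams", "ARMY", "O-3"), ("Baker", "NAVY", "O-4")],
)

# A view presents the old shape (name, service) to current programs.
conn.execute(
    "CREATE VIEW students AS "
    "SELECT full_name AS name, branch AS service FROM students_v2"
)

# An existing query, unchanged, still runs against the view.
OLD_QUERY = "SELECT name FROM students WHERE service = ? ORDER BY name"
navy_before = [row[0] for row in conn.execute(OLD_QUERY, ("NAVY",))]

# Updating the governing table is reflected in the view automatically.
conn.execute("INSERT INTO students_v2 VALUES ('Clark', 'NAVY', 'O-2')")
navy_after = [row[0] for row in conn.execute(OLD_QUERY, ("NAVY",))]

# The system catalog records both the table and the view.
catalog = conn.execute(
    "SELECT name, type FROM sqlite_master ORDER BY name"
).fetchall()
```

The old query never mentions students_v2, which is the point: only
the view definition has to change when the governing table does.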

ORACLE does provide the basic functions of definition,
update, retrieval, and software interface. However, like
other relational database management systems with dependent
data dictionaries, it does not offer the range of functions
of the other data dictionaries discussed in this chapter,
nor does it accomplish satisfactorily the three main
objectives of data management discussed in Chapter IV.
ORACLE's data dictionary


provides little more than a method of defining the
schema. The relational database management system
'dictionary' arises because the system needs a way to store
the schema and it does this through the use of the same
tables (relations) as it uses for the main database.
[Ref. 26]


ORACLE  could,   however,   serve  as   a  good  starting  point  for 
further  development. 


The  modern  relational  DBMS  does  provide  a  very  good 
basis  for  a  good  dictionary  system.  This  is  because  the 
normal  relational  DBMS  is  equipped  with  two  features 
that  help  in  making  the  implementation  easy: 

1.  Many  relational  DBMS  now  have  a  "triggering"  feature 
that  causes  a  procedure  to  be  invoked  on  some  data 
condition  or  event.  Such  a  feature  is  needed  to  tie  a 
DBMS  to  a  dictionary  system. 

2.  The  availability  of  the  schema  tables  substantially 
reduces  the  effort  in  implementing  the  dictionary 
system.   [Ref.  27] 


The most important shortcoming of ORACLE's data dictionary
is its lack of documentation, without which it is difficult
to manage all aspects of an organization's data. If this
objective were incorporated into the system, ORACLE would be
a much more valuable tool.

E.   COMPARISON OF DATA DESIGNER, DATAMANAGER,
DATADICTIONARY, AND ORACLE

Now that four representative samples of commercial data
dictionaries have been evaluated, we will compare the
primary features of each and identify which one(s) have come
closest to providing the features of our ideal system. For
ease of comparison, we have grouped all of the features,
functions, and guidelines that have been identified into the
six evaluation criteria categories: system standard schema
& extensibility, command and query languages, ease of use
(including menus), security, documentation and reports, and
application interfaces.

As the data dictionaries are evaluated in each of the
six categories, a brief chart will be used to compare each
dictionary against the FIPS standards. Each chart will
compare five data dictionaries:

FIPS  =  The  ideal/FIPS  data  dictionary 

MSP  =  MSP  DATAMANAGER 

ADR  =  ADR  DATADICTIONARY 

DDE  =  DATA  DESIGNER 

ORA  =  ORACLE  DBMS/DD 

A very subjective scoring system will be used, with grades
ranging from three to zero. The ideal/FIPS standard will
automatically receive a grade of "3" in each area,
representing the ideal combination of features. The meaning
of each grade is as follows:

"3" = Very strong performance by DD; no criticism
"2" = Good performance by DD; one or more significant
      shortcomings
"1" = (1) DD supports functional area very poorly;
      (2) DD does not support functional area, but
          another component of the system does.
"0" = DD (and remainder of system) fails to support
      this function

First, the data dictionary should provide a system
standard schema and the capability to add new entities,
relationships, and attributes to it. As shown in Table 12,
while DATADICTIONARY and DATAMANAGER closely resemble the
ideal system proposed by the FIPS, DATA DESIGNER and, in
particular, ORACLE fail to provide these capabilities.
DATAMANAGER supports three "add-on" collections of schema
descriptors. When added to the standard schema, each will
increase DATAMANAGER's capabilities to support a specific
application, e.g., programming.


TABLE 12
Category One: Schemas and Extensibility

Functional Category       FIPS   MSP   ADR    DDE   ORA

System Stand. Schema        3     3     3      1     0
  Entity-types            (10)         (20)
  Relationship-types                   (13)
  Attribute-types                      (50+)
DA/User Extensible          3     3     3      0     0

Category Subtotals          6     6     6      1     0

Second, the data dictionary should provide a command
language that will support queries from users while
reserving some capabilities solely for the use of the
dictionary administrator. This last ingredient supports
security and data integrity. Again, as seen in Table 13,
DATADICTIONARY and DATAMANAGER provide all capabilities of
the FIPS standard while the other two lag behind.

TABLE 13
Category Two: Command/Query Languages

Functional Category       FIPS   MSP   ADR   DDE   ORA

CMD Interface Lang.         3     3     3     3     2
Query Commands              3     3     3     1     1
DA-Only Commands            3     3     3     0     2

Category Subtotals          9     9     9     4     5

Third, the ideal data dictionary must be relatively easy
to use, yet still powerful enough to support the experienced
user. One of the major ingredients of user-friendliness is
a menu-driven (or panel-driven) format. Good,
easy-to-understand examples are another important aid to the
new user. Table 14 reveals that, in our opinion, none of the
four systems can be considered easy to use. Looking at the
four as a group, two fail to use menus, one provides
examples which are complex and hard to understand, and the
fourth fails to provide either menus or good examples.

Fourth, security is one of the primary objectives of a
data dictionary. It should not only be able to control
general access to the system, but should also support the
capability to provide different levels of access to
different users. In Table 15, three of the four,
DATADICTIONARY, DATAMANAGER, and ORACLE, receive high marks
for providing both aspects of security. Security for
information contained within DATA DESIGNER must be provided
by the parent DBMS.

TABLE 14
Category Three: Relative Ease of Use

Functional Category       FIPS   MSP   ADR   DDE   ORA

Menu-Driven                 3     0     3     0     0
New User Friendly           3     1     2     2     2
Good Setup Example
  in Documentation          3     1     2     3     3

Category Subtotals          9     2     7     5     5


TABLE 15
Category Four: Security

Functional Category       FIPS   MSP   ADR   DDE   ORA

Access Control
  (Password)                3     3     3     1     2
Degrees of Access
  (Levels)                  3     3     3     1     3
DA-Only Privileges          3     3     3     2     3

Category Subtotals          9     9     9     4     8

Fifth, the clearness and logical layout of system
documentation should be considered. Additionally, the reports

and the documentation prepared by the data dictionary must
be evaluated for usability. As indicated in Table 16, each
of the four data dictionaries approaches that of our ideal
FIPS standard. It is interesting to note that the two
frontrunners, DATADICTIONARY and DATAMANAGER, have some
problems with documentation complexity.

TABLE 16
Category Five: Documentation and Reports

Functional Category       FIPS   MSP   ADR   DDE   ORA

SYS Documentation
  clear/laid out well       3     2     2     3     3
Good Examples of
  Report Types              3     2     3     3     3
Reports Readable            3     3     3     3     3

Category Subtotals          9     7     8     9     9

Finally, the ideal data dictionary should support a
variety of applications, interfacing with both DBMS and
programming languages. DATADICTIONARY and DATAMANAGER both
provide interfaces to one or more DBMS and to two or more
programming languages. Table 17 pertains. While DATA
DESIGNER and ORACLE only interact with their system DBMS,
DATAMANAGER provides flexibility and versatility by
supporting several popular DBMS.


TABLE 17
Category Six: Application Interfaces

Functional Category       FIPS   MSP   ADR   DDE   ORA

DBMS Interface(s)           3     3     2     1     1
Language Interfaces         3     3     3     1     1

Category Subtotals          6     6     5     2     2

When total "scores" are calculated, the results are as
shown in Table 18. While none of the systems provides all
of the characteristics of the ideal/FIPS system, ADR
DATADICTIONARY and MSP DATAMANAGER come the closest. If an
organization were starting "fresh", with no previous
investment in software, the ADR family of products, RIME,
warrants serious consideration. If, on the other hand, the
organization already has one of the popular DBMS, and is
simply seeking to add a new, or better, data dictionary, the
free-standing DATAMANAGER might very well satisfy the need.
In each of these two excellent commercial packages, the
observed shortcomings lie in the areas of user friendliness
and clear examples for new users. Although these are
important requirements, the faults will be overcome as the
users gain experience.

TABLE 18
Data Dictionary Comparison Totals

Functional Category       FIPS   MSP   ADR   DDE   ORA

Schemas/Extensible          6     6     6     1     0
Command/Query Lang.         9     9     9     4     5
Ease-of-Use                 9     2     7     5     5
Security                    9     9     9     4     8
Documentation/Rpts          9     7     8     9     9
Application Inter.          6     6     5     2     2

Comparison Totals          48    39    44    25    29
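
The arithmetic behind the comparison totals is simple enough to
check mechanically. The sketch below transcribes the six category
subtotals reported in Table 18 and sums each column; the variable
and function names are ours, chosen only for illustration.

```python
# Column order used throughout the comparison charts.
SYSTEMS = ("FIPS", "MSP", "ADR", "DDE", "ORA")

# Category subtotals as reported in Table 18.
SUBTOTALS = {
    "Schemas/Extensible":  (6, 6, 6, 1, 0),
    "Command/Query Lang.": (9, 9, 9, 4, 5),
    "Ease-of-Use":         (9, 2, 7, 5, 5),
    "Security":            (9, 9, 9, 4, 8),
    "Documentation/Rpts":  (9, 7, 8, 9, 9),
    "Application Inter.":  (6, 6, 5, 2, 2),
}


def comparison_totals(subtotals):
    """Sum each system's column across all six categories."""
    return {
        name: sum(row[i] for row in subtotals.values())
        for i, name in enumerate(SYSTEMS)
    }


totals = comparison_totals(SUBTOTALS)

# Rank the four commercial products, best total first (FIPS is the yardstick).
ranking = sorted(
    (name for name in SYSTEMS if name != "FIPS"),
    key=totals.get,
    reverse=True,
)

for name in SYSTEMS:
    print(f"{name:5s}{totals[name]:4d}")
```

Because the grades are subjective, the totals are best read as a
rough ordering rather than a precise measurement.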

In the case of the other two dictionaries, their
shortcomings would be far harder to forgive. Their problems
lie in areas of standard schemas, extensibility, security,
etc. Each seems more user-friendly, but, since they do less,
there are fewer procedures to be explained. DATA DESIGNER,
although an interesting package, simply does not provide
several of the primary characteristics that we expect to
find in an ideal data dictionary. ORACLE is certainly the
weakest of the four dictionaries we evaluated. As part of
the ORACLE DBMS, this system does provide some data
dictionary features. However, it is not the full-featured
data dictionary we would recommend.

VI.  EXPANSIONS OF THE ROLE OF DATA DICTIONARIES

In this chapter we will suggest ways in which the role
of the data dictionary can be expanded beyond the basic uses
discussed in previous chapters. We will look first at how
the data dictionary can enforce standards in today's
increasingly common distributed data processing environment.
Then we will show how the process of decision making can be
supported through the use of a data dictionary. In
conclusion, we will attempt to foresee where data dictionary
technology will lead information resource management in the
years to come.

A.   DISTRIBUTED  DATA  PROCESSING 

Our discussion of databases up to this point has
centered around the assumption that an organization has one
centralized database, with centralized database management
and control, that would be accessed by all users. However,
many organizations have decided to distribute computing
power to various departments and/or outlying sites,
depending on the organization's structure. In such a
situation, it is also likely that the organization's database
will have to be distributed. A distributed database is "a
consistent, logically interrelated collection of data stored
at dispersed locations" [Ref. 28]. These dispersed
locations, called nodes, are connected by means of a network
which allows the nodes to communicate.

Many factors have contributed to the increasing
popularity of distributed processing. Two of the most
important are the following: [Ref. 29]

1.  Numerous advances in technology that have provided
more powerful processing hardware at lower cost and
improved communication and network capabilities.

2.  The need for faster and easier access to
time-critical information to assist in the decision
making of organizations with geographically
dispersed components requiring unified information
sharing and processing. (This concept will be
discussed in detail in the next section.)

For organizations that employ a centralized approach to
control widely-dispersed, autonomous divisions, an attempt
to adhere to the traditional concepts of centralized
information resources may be ineffective and self-defeating.
These organizations might be tempted to sacrifice the
ability to better satisfy user needs in order to preserve
control and traditional relationships. Fortunately,
managers are rapidly becoming aware of the many potential
advantages of distributing some, or all, of the
organization's data processing functions to the user level.
Technological advances continue to encourage these changes
because

The availability of major computing resources in small,
low-cost packages allows the dedication and distribution
of needed capabilities, either standing alone or
interconnected, when and where they are needed. Many of the
complexities of centralized large-scale computing
facilities are no longer necessary. [Ref. 30]

It  is  important  to  remember,  however,  that 

the  complexities  of  integrated  systems  require  digital 
data  communications,  appropriate  software,  and  extensive 
planning  and  coordination.  These  complexities  should 
not  be  underestimated.   [Ref.  30] 

One very successful corporation, Hewlett-Packard,
utilizes a combination of centralized, decentralized, and
distributed systems to support a variety of needs within the
organization. Corporate planning, employee benefits, and
establishment of standards are performed on mainframes
located at central management. Daily operations, data
processing, and employee pay and records have been
decentralized and are independently performed by each
division. Other functions, e.g., customer sales and support,
have been distributed to increase responsiveness and
timeliness.


Successful systems put the control of the data close to
the source of the information and the control of
processing close to the manager responsible for the
function being performed. In an organization like
Hewlett-Packard, this will frequently, but not always,
imply distributing the processing. Distributed
processing has made it possible for us to adapt to a
constantly expanding geographic operation, and a
constantly changing organizational structure, while
maintaining consistent administrative support.
[Ref. 31]


Another class of organization includes those that have
become so large and dispersed that they simply cannot be
supported effectively by totally centralized resources. The
armed services are prime examples of this type. For
example,

In an organization as large and decentralized as the
Navy, it would be impossible and inappropriate to impose
centralized control over the thousands of individual
small system applications that are clearly being put to
productive use. In fact, their main strength is their
ability to solve many of the information-handling
problems of users at the local level, without the need for
centralized software development and procurement delays.
[Ref. 32]


In the years ahead, a growing awareness of these conditions
will drive an ever-increasing number of military
organizations to distribute some portion of their information
resource needs.

Data dictionaries that are designed for operations
within distributed environments will require all of the
capabilities of those operating solely in a centralized
environment. However, a distributed data dictionary must
support three specialized functions in addition to basic
data dictionary functions:

1.  the ability to locate data within the network

2.  the coordination/management of distributed data

3.  the ability to perform data transformation in
support of user applications

The distributed data dictionary's directory function enables
it to identify which network node contains the specific
information that is needed. Whether the particular database
is distributed by replication or partitioning, the data
dictionary must provide information about its logical and
physical characteristics.
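
A minimal sketch of such a directory function follows. It is an
illustration under stated assumptions, not a description of any
product evaluated in this thesis: the class names, the node names
(NODE_A and so on), and the placement of the "Personnel" and
"Inventory Item" data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    element: str        # name of the file or data element
    distribution: str   # "replicated" or "partitioned"
    nodes: tuple        # nodes holding a full copy or a fragment


class DictionaryDirectory:
    """Answers the question: which node(s) hold this data?"""

    def __init__(self, entries):
        self._entries = {entry.element: entry for entry in entries}

    def locate(self, element, local_node):
        """Return the node(s) a request issued at local_node must contact."""
        entry = self._entries[element]
        if entry.distribution == "replicated":
            # Any copy will do; prefer the local one when it exists.
            if local_node in entry.nodes:
                return (local_node,)
            return entry.nodes[:1]
        # Partitioned: every fragment is needed to assemble the whole.
        return entry.nodes

    def update_targets(self, element):
        """Nodes that must all be updated to keep the data consistent."""
        return self._entries[element].nodes


directory = DictionaryDirectory([
    DirectoryEntry("Personnel", "replicated", ("NODE_A", "NODE_B")),
    DirectoryEntry("Inventory Item", "partitioned", ("NODE_B", "NODE_C")),
])
```

A read of replicated data can be satisfied by one node, while a
read of partitioned data, or any update of replicated data, must
reach every node listed in the entry.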


In the case of replicated data, where functionally
identical copies of the data are stored at multiple nodes in
the network, the distributed DD/DS [data dictionary]
must have knowledge of the known redundancies throughout
the network. Synchronization of updates in this case is
critical. [Ref. 33]


In a partitioned database, where only certain portions of
the database are located at individual nodes, the data
dictionary's role becomes even more important because "it
must know the relationships among the pieces, and be able to
manage all the parts, such that this physical dispersion of
the data is transparent to the user" [Ref. 34]. Finally,
the distributed data dictionary may be required to perform
transformation of data to support various users. If serving
a heterogeneous network (one in which dissimilar types of
hardware and software coexist), the data dictionary will have
to translate between different data and storage structures.
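
As a rough illustration of this kind of translation, the sketch
below stores a field-by-field mapping between a source and a
target record description and applies it to one record. The field
names, formats, and conversion rules are invented for the example;
a real dictionary would hold far richer metadata.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FieldMapping:
    source_field: str                  # field name at the sending node
    target_field: str                  # field name at the receiving node
    convert: Callable[[object], object]  # value conversion between formats


# Hypothetical mappings the dictionary would store for one record type.
PERSONNEL_MAPPINGS = (
    FieldMapping("SSN", "social_security_number",
                 lambda v: f"{v[:3]}-{v[3:5]}-{v[5:]}"),
    FieldMapping("LAST_NAME", "last_name", str.title),
    FieldMapping("PAY_CENTS", "monthly_pay", lambda cents: cents / 100),
)


def transform(record, mappings):
    """Build the target record by applying each stored mapping in turn."""
    return {m.target_field: m.convert(record[m.source_field]) for m in mappings}


source_record = {"SSN": "123456789", "LAST_NAME": "OWENS", "PAY_CENTS": 150000}
target_record = transform(source_record, PERSONNEL_MAPPINGS)
```

Keeping the mappings as data, rather than burying them in each
application program, is what lets the dictionary serve as the
single place where such translations are defined.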


The distributed DD/DS [data dictionary] can facilitate
these translation processes by providing the metadata
mappings to allow the source to be transformed into the
target data. This is accomplished by storing in the
data dictionary the source and target metadata
descriptions to be used by the mapping process. [Ref. 35]

It is possible for the distribution of dictionary
capabilities to be accomplished by several alternative
configurations. One possible configuration, as mentioned
earlier, involves duplicating the data dictionary in its
entirety at each node of the network. An example of this is
shown as Figure 6.1. (Dashed lines indicate node-to-node
communications and dotted lines indicate
dictionary-to-dictionary communications.) Each data
dictionary will contain a complete copy of the entire
organization's metadata. While the nodes themselves will
interact frequently, the various copies of the dictionary
will not. However, when one copy of the dictionary is
updated, all other copies must be automatically updated if
data integrity is to be maintained. This duplication of
metadata will result in some degree of additional overhead,
but it will improve the responsiveness of the system and
minimize the necessity of inter-data dictionary queries. In
some implementations, communication costs can be
significantly reduced. This configuration will be most
desirable in cases in which the organization's database(s)
are also duplicated at each node or if nodes would be likely
to access each other's metadata often. A stable organization
with well-established data processing, where metadata is not
continuously being updated, would benefit most from this
configuration.

[Diagram: four network nodes, each holding a complete DATA
DICTIONARY]

Figure 6.1    Duplicated Data Dictionaries
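
The trade-off in the duplicated configuration (cheap local reads,
costly updates) can be made concrete with a short sketch. The
class name, the message counter, and the sample metadata below are
assumptions made for illustration only.

```python
class DuplicatedDictionary:
    """Every node keeps a complete copy of the organization's metadata."""

    def __init__(self, node_names):
        self.copies = {node: {} for node in node_names}
        self.messages_sent = 0   # dictionary-to-dictionary traffic

    def lookup(self, node, name):
        """Reads are always local, so no inter-dictionary query is needed."""
        return self.copies[node].get(name)

    def update(self, node, name, definition):
        """Apply the change locally, then propagate it to every other copy."""
        for other, metadata in self.copies.items():
            metadata[name] = dict(definition)
            if other != node:
                self.messages_sent += 1


dd = DuplicatedDictionary(["NODE_A", "NODE_B", "NODE_C", "NODE_D"])
dd.update("NODE_A", "Readiness Status", {"type": "code", "length": 2})
```

With four nodes, a single update generates three propagation
messages, which is why the configuration suits organizations
whose metadata changes infrequently.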

In the second configuration, the data dictionary is
partitioned among the various network nodes. As shown in
Figure 6.2, each node contains only that portion of the
dictionary that contains the metadata it requires. No one
node or station within the system will have a complete data
dictionary. This configuration would be used when there is
not much need for the nodes of the network to access each
other's metadata and there is a relatively clear-cut
differentiation between the functions being carried on at
each node, which implies different metadata. Because
redundancy is kept to an absolute minimum, problems could
arise if a node's data dictionary partition were lost unless
good backup procedures were in effect. Since each node is
only responsible for maintaining its own portion of the
whole, there is little update overhead and thus little system
delay as long as the required metadata exists at that
particular node.

[Diagram: four network nodes, each holding one DD Partition]

Figure 6.2    Partitioned Data Dictionary (DD)

In the final configuration, the data dictionary is
distributed in a hierarchical structure. There will be one
"master" copy of the dictionary and one or more partial
copies throughout the network, as shown in Figure 6.3. In
this configuration, each node that contains a portion of the
data dictionary is responsible for updating the master
dictionary whenever its portion is modified. This structure
ensures data integrity and provides flexibility by allowing
varying amounts of metadata to be distributed. Another use
for this hierarchical structure might be to separate
functionality within a network, e.g., database, automated
office, and programming functions. Each of these functions
is able to maintain its portion of the dictionary locally
while one master copy is available to handle inter-partition
queries.

[Diagram: one network node holding the master DATA DICTIONARY,
connected to three network nodes that each hold a DD Partition]

Figure 6.3    Hierarchy of Distributed Data Dictionaries
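
The hierarchical arrangement can be sketched in the same spirit.
In the fragment below, each partition reports its changes to a
master copy, and the master answers queries that a partition
cannot satisfy locally. The class names and the "database" and
"office" partitions are hypothetical.

```python
class MasterDictionary:
    """Holds one entry for every definition made in any partition."""

    def __init__(self):
        self.metadata = {}   # name -> (owning partition, definition)


class PartitionDictionary:
    """A partial copy that keeps the master informed of its changes."""

    def __init__(self, node, master):
        self.node = node
        self.master = master
        self.metadata = {}

    def define(self, name, definition):
        self.metadata[name] = definition
        # Each partition is responsible for updating the master copy.
        self.master.metadata[name] = (self.node, definition)

    def lookup(self, name):
        """Answer locally when possible; otherwise ask the master."""
        if name in self.metadata:
            return self.node, self.metadata[name]
        # Inter-partition query; raises KeyError if no partition defines it.
        return self.master.metadata[name]


master = MasterDictionary()
database = PartitionDictionary("database", master)
office = PartitionDictionary("office", master)
database.define("Personnel", "file")
```

Here the office partition never stores the "Personnel" entry
itself, yet office.lookup("Personnel") still resolves it through
the master.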

There are presently several commercial packages in the
development or testing stages that will be able to satisfy
the requirements of distributed processing. One system that
is already available and being used in numerous applications
is ADR's Relational Information Management Environment
(RIME) system. As discussed in Chapter V, this system
features fourteen separate components that can be integrated
into one "total" system. One component, D-NET, combines a
database, data dictionary, and communications interfaces to
support the special requirements of distributed processing.
D-NET is capable of supporting both homogeneous and
heterogeneous networks:


The flexibility provided by D-NET and the other software
components allows users to configure the distributed
system networks based on the needs of each node.
Various operating systems, computer types, and
cooperating software products can be used to create a
specific environment without impacting application
development and operations. [Ref. 36]


D-NET can implement the system's data dictionary,
DATADICTIONARY, as either one centralized dictionary or as
multiple copies stored at remote locations. Similarly,
RIME's database, DATACOM/DB, can be maintained either at one
centralized location or distributed to various nodes
throughout the network. D-NET serves as the basis of the
Army's project VIABLE, providing numerous benefits that
include cost effectiveness, high expandability, increased
productivity, resource control and synchronization, and
independent operation at the local user's level.

B.   DECISION-MAKING

In this section we will show how the data dictionary
provides managers with the efficiently recorded, accurate,
and timely information necessary to make decisions in
consonance with the goals of the organization, whether in a
centralized or distributed environment. According to the
report of the Committee on Review of Navy Long-Range ADP
Planning, "information technology", which includes data
dictionaries, is

critical to the Navy's ability to fulfill its wartime
and peacetime roles in an optimum manner. The available
technologies would enable the Navy to approach its
missions with information and data that (1) have been
collected and recorded simply, (2) have improved
accuracy, (3) have been speedily reported, collated, and
distributed, (4) lead to summaries that are timely and
to the point, as and when needed, and (5) have enabled
both manpower commitments and costs to be reduced.
[Ref. 37]


1.   The Decision-Making Process

Herbert Simon's classic model of the decision-making
process, as cited by Sprague and Carlson [Ref. 38], consists
of three distinct steps: intelligence, design, and choice.
The use of a data dictionary supports the decision maker as
he takes each step.

a.  Intelligence involves searching the environment
for conditions calling for decisions. Raw data must be
obtained, processed, and examined for clues that may
identify problems. However, so much data is available within
an organization that a seemingly infinite parade of
information can be produced; this situation is called
information overload. There must be some way of narrowing
down the amount of information that is presented to the
decision maker. A data dictionary used in conjunction with a
database can play an important role in this narrowing
process. As discussed earlier in the thesis, the dictionary
helps an organization identify and eliminate redundant data.
Its query language can be used to select information about a
particular entity and its report definition capability can
be used to generate aggregate, rather than detailed data.
Relationships between entities are easily identified so that
managers' questions such as "What is the range of values for
'Readiness Status' data?" and "Which departments receive the
'Ammunition Transaction' report?" can be answered.

b.  Design entails inventing, developing, and
analyzing possible courses of action. This involves
processes to understand the problem, to generate solutions,
and to test solutions for feasibility. The data dictionary
plays a key role in documenting the decision maker's
environment so that he or she will have a centralized source
of information from which to develop possible choices. The
dictionary can also be used to tailor information to meet
specific needs by defining user views of data and
restricting user access to certain data. In this way, users
can be presented only with the information they are supposed
to have and need to have, as determined by higher authority
in the organization, instead of having to deal with
non-essential information.
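
The queries and user views mentioned above can be sketched as
follows. The dictionary entries, department names, permitted
values, and roles in this fragment are all invented for
illustration; they are not drawn from any actual shipboard system.

```python
# Hypothetical dictionary entries describing a report and a data element.
DICTIONARY = [
    {
        "kind": "report",
        "name": "Ammunition Transaction",
        "recipients": ["Weapons", "Supply"],
        "visible_to": {"Commanding Officer", "Weapons Officer"},
    },
    {
        "kind": "element",
        "name": "Readiness Status",
        "allowed_values": [1, 2, 3, 4],
        "visible_to": {"Commanding Officer"},
    },
]


def user_view(entries, role):
    """Restrict the dictionary to the entries a given role may see."""
    return [entry for entry in entries if role in entry["visible_to"]]


def recipients_of(entries, report_name):
    """Which departments receive the named report?"""
    for entry in entries:
        if entry["kind"] == "report" and entry["name"] == report_name:
            return sorted(entry["recipients"])
    return []


def value_range(entries, element_name):
    """What is the range of values for the named data element?"""
    for entry in entries:
        if entry["kind"] == "element" and entry["name"] == element_name:
            values = entry["allowed_values"]
            return min(values), max(values)
    return None
```

A role that is not authorized to see an entry simply gets no
answer, which mirrors the idea of presenting users only with the
information they are supposed to have.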

In addition to recording information about the
plans, structure, and functions of the organization, the
data dictionary can also be used to record information about
the decision makers themselves. In the case of the U.S.S.
Constellation, for example, information about the commanding
officer and the key elements of his environment can be
documented: which decisions he wishes to make and which ones
his subordinates will make, the mission assigned to the
carrier by the C.O.'s superiors, the relative priorities he
attaches to various subjects, his short term and long term
personal goals, previous decisions he has made, and so on.

c. Choice involves selecting a particular course of
action from those available and implementing that choice.
Of course, the ultimate decision will lie with the decision
maker, and not with the data dictionary. At best, the data
dictionary can present options to the decision maker and,
once the choice is made, can document the steps taken to
implement that choice.
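
The idea of user views described under Design can be
illustrated with a short sketch. It is not drawn from any
particular product: the class names, the view name, and the
sample record below are hypothetical, and a real dictionary
would enforce access through the DBMS rather than in
application code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UserView:
    """A named subset of fields that a class of users may see (hypothetical)."""
    name: str
    allowed_fields: frozenset


@dataclass
class DataDictionary:
    """Toy dictionary holding view definitions and user-to-view assignments."""
    views: dict = field(default_factory=dict)        # view name -> UserView
    assignments: dict = field(default_factory=dict)  # user -> view name

    def define_view(self, name, allowed_fields):
        self.views[name] = UserView(name, frozenset(allowed_fields))

    def assign(self, user, view_name):
        if view_name not in self.views:
            raise KeyError(f"unknown view: {view_name}")
        self.assignments[user] = view_name

    def present(self, user, record):
        """Return only the fields of `record` that the user's view permits."""
        view_name = self.assignments.get(user)
        if view_name is None:
            return {}  # no view assigned, so nothing is shown
        allowed = self.views[view_name].allowed_fields
        return {k: v for k, v in record.items() if k in allowed}


if __name__ == "__main__":
    dd = DataDictionary()
    dd.define_view("operations", ["Name", "Readiness Status"])
    dd.assign("ops_officer", "operations")
    record = {
        "Name": "placeholder",
        "Social Security Number": "placeholder",
        "Readiness Status": "placeholder",
    }
    print(dd.present("ops_officer", record))
```

Because the view is defined once in the dictionary, changing
what a class of users may see requires changing one
definition rather than every program that reads the record.
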

2.   Crisis Management

The accuracy and timeliness of information provided
to the decision maker become critically important when
the decision-making process occurs during a crisis
situation. In wartime, for example, there is usually a great
deal of risk associated with a decision: many decision
makers are involved, information must be consolidated from a
variety of sources and locations, little time is available
to make decisions, and, due to the uniqueness of events,
there is often no pre-defined structure for making the
decision. There are four ways that the data dictionary can
prove especially helpful in crisis decision-making.

a. The dictionary speeds up the information-gathering
process. As discussed earlier, user views and
accesses have been pre-defined and can be changed easily as
needed. Active data dictionaries provide for automatic
update of any changes that are made, so information is
always current.

b. The dictionary prioritizes information. The
priorities of the organization and the decision makers are
taken into account and can be updated as events occur. In
this way, the attention of decision makers is focused on
truly important information rather than dispersed over a
wide range of information.

c.  The  dictionary  provides  a  common  information 
base.  This  is  important  when  many  decision  makers  at 
different  locations  are  involved.  All  participants  have  the 
latest  information  and  can  also  take  advantage  of  the 
"corporate  memory"  provided  by  the  dictionary. 

d. In short, the dictionary provides "intelligent"
information management. It reduces information overload,
tailors information to specific decision makers' needs, and
responds well to infrequent, ad hoc requests. It helps to
establish relationships between events as they occur.

The typical, or even "ideal", data dictionary will
not be able to fully support the decision-making process
without the help of additional sophisticated software to
take advantage of its capabilities. We believe that as the
acceptance and use of the data dictionary as a tool for
information resource management become widespread, the
demand for an expanded role for the dictionary will
increase. Organizations must become more accomplished in
the top-down planning process of the system development life
cycle in order to receive maximum benefits from data
dictionary technology.

C.   CONCLUSIONS 

In  this  thesis,  we  have  discussed  the  structure,  func- 
tions, and  objectives  of  a  data  dictionary.  We  have 
compared  popular  commercial  products  to  an  "ideal" 
dictionary  based  on  criteria  we  developed  and  on  FIPS  DDS 
guidelines.  We  have  analyzed  the  role  of  a  data  dictionary 
in  information  resource  management,  including  its  support  of 
a  distributed  data  processing  environment  and  of  the 
decision-making process. It seems clear that as
organizations become cognizant of the need to manage their
information efficiently, the importance and necessity of data
dictionary implementation will continue to increase.

Designers  of  data  dictionaries  are  aware  of  these  trends 
and  are  moving  in  the  following  directions: 


First,  toward  what  is  known  as  an  integrated  data 
dictionary  and  second,  toward  a  free-standing  dictionary 
that  serves  as  a  driver  of  a  distributed  data  processing 
system  made  up  of  several  types  of  computers,  data  base 
management  systems,  file  managers,  and  text  editors. 
[Ref. 39]


In reference to the first projection, several commercial
systems have been developed that feature integration of a
data dictionary with a database. One example of this is
ADR's RIME, which features integration of a database and a
data dictionary with numerous other components to form one
very capable and flexible system. Addressing the second
projection, Rullo [Ref. 40] foresees development of a "super
data dictionary" to support future integrated and
distributed systems:


In this environment, the data dictionary would act as a
driver of the system. The data dictionary/data
directory might also have some integrated facilities
permitting transfer of data among other system software
functions including itself. There is a trend in this
direction, with other systems depending on the data
dictionary/data directory and that system itself
beginning to resemble a model of the enterprise.


We believe the future holds significant improvements and
expansions of data dictionary technology. It is important
that the development of standards for data dictionary
compatibility continue alongside the standards
that are currently being developed to support network
communications. It is conceivable that these standards, if
widely accepted, would allow any data dictionary to "talk"
to another and to exchange information. The FIPS DDS
standards developed by the National Bureau of Standards will
most likely become the basis for data dictionaries procured
and used by the federal government.

We also foresee the use of fourth generation languages,
the extremely user-friendly, "close-to-natural-language"
languages that will facilitate user access to the
dictionary's metadata. These languages will replace the formal
command languages and awkward syntax described earlier in
the thesis. Another factor contributing to the increased
utility of data dictionaries will be the use of
sophisticated software and artificial intelligence techniques in
conjunction with the dictionary. As the central source of
data about an organization, the data dictionary contains a
broad base of information upon which an artificial
intelligence "expert" system can be built. For example, it is
possible that an expert system would be able to verify and
validate additions to the dictionary schema based on
pre-determined rules and information gained from previous
manipulations of the schema. It would also be able to establish
associations between the contents of the data dictionary and
flag them for the attention of the decision maker. In
addition, a "smart" data dictionary would be able to "realize"
that every time a user logs on to the system, he asks for
particular information, so that eventually, the data
dictionary will provide it for him automatically.
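
A minimal sketch may make the idea of rule-based validation
of schema additions more concrete. It covers only the
"pre-determined rules" half of the suggestion and does not
learn from previous manipulations. The rule set, the
entity-type names, and the entry layout are assumptions
made for illustration and are not taken from any existing
product or from the FIPS DDS.

```python
# Hypothetical set of entity types the dictionary schema accepts.
KNOWN_ENTITY_TYPES = {"ELEMENT", "RECORD", "FILE", "REPORT"}

# Each rule pairs a check with the message reported when the check fails.
# A check receives the proposed entry and the current schema contents.
RULES = [
    (lambda entry, schema: bool(entry.get("name")),
     "entry must have a name"),
    (lambda entry, schema: entry.get("name") not in schema,
     "name already exists in the dictionary"),
    (lambda entry, schema: entry.get("entity_type") in KNOWN_ENTITY_TYPES,
     "unknown entity type"),
]


def validate_addition(entry, schema, rules=RULES):
    """Return the messages of all rules that the proposed entry violates."""
    return [message for check, message in rules if not check(entry, schema)]


if __name__ == "__main__":
    schema = {"Personnel": {"entity_type": "FILE"}}
    proposed = {"name": "Personnel", "entity_type": "FILE"}
    for problem in validate_addition(proposed, schema):
        print("rejected:", problem)
```

Keeping the rules as data rather than as program logic is
the design choice that matters here: new rules can be added
to the list without changing the validation routine.
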

No  matter  what  changes  occur  in  data  dictionary  tech- 
nology, the  data  dictionary's  role  in  the  efficient  manage- 
ment of  an  organization's  information  resource  will  continue 
to  be  an  increasingly  important  one.  The  dictionary  will 
support  the  organization  in  its  planning  and  analysis  of 
functions,  its  development  of  information  systems,  the  main- 
tenance of  those  systems,  and  the  intelligent  use  of  those 
systems.  We  believe  that  the  military  will  soon  provide  a 
vast  market  for  data  dictionary  software  and  that  the 
demands  of  its  users  will  drive  data  dictionary  technology 
even  further. 

APPENDIX A
BACKUS-NAUR FORM

Backus-Naur form is a graphic notation for describing
the syntax of a language. It is used by the Federal
Information Processing Standard for Data Dictionary Systems
(FIPS DDS) to show the format of the commands used to
manipulate the dictionary. The following are common Backus-Naur
symbols used by the FIPS DDS:

<  >   denotes a word or phrase

|      indicates a choice between two or more
       alternatives, "or"

[  ]   represents an option that the user may or may not
       include

{  }   is used to set off choices separated by "|" and
       to enclose the format of the command

The syntax for the ADD-ENTITY command appears as
follows:

ADD-ENTITY

{[OF] {ENTITY-TYPE | E-T} <entity-type-name>

WHERE NAME [IS] <name-clause>

[WHERE {ATTRIBUTE | A} [FOR] <attribute-clause-1>

[,..., <attribute-clause-n>]]

[WITH SECURITY <security-clause>]}

It indicates that there are several different ways of adding
an entity to the dictionary. At a minimum, the command must
include ENTITY-TYPE or E-T, an entity-type name, WHERE NAME,
and a name clause. The words OF and IS are optional, as are
the last two phrases set off by brackets. If the phrases
are used, the same rules hold for choosing elements within
them.
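
The rules above can be sketched as a small recognizer for
the ADD-ENTITY command. This is an illustration only, not
part of the FIPS DDS. The internal format of the name,
attribute, and security clauses is not given here, so the
sketch assumes that each clause is a single token
containing no spaces or commas. The function name and the
sample values are hypothetical.

```python
def parse_add_entity(command):
    """Parse an ADD-ENTITY command, assuming single-token clauses."""
    # Make commas separate tokens so attribute lists can be split.
    tokens = command.replace(",", " , ").split()
    pos = 0

    def peek():
        return tokens[pos].upper() if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("command ended unexpectedly")
        tok = tokens[pos]
        pos += 1
        return tok

    def expect(*keywords):
        tok = take()
        if tok.upper() not in keywords:
            raise ValueError(f"expected {' or '.join(keywords)}, got {tok!r}")

    def skip_optional(keyword):
        nonlocal pos
        if peek() == keyword:
            pos += 1

    expect("ADD-ENTITY")
    skip_optional("OF")                    # [OF]
    expect("ENTITY-TYPE", "E-T")           # {ENTITY-TYPE | E-T}
    result = {"entity_type": take(), "attributes": [], "security": None}
    expect("WHERE")
    expect("NAME")
    skip_optional("IS")                    # [IS]
    result["name"] = take()

    if peek() == "WHERE":                  # optional attribute phrase
        take()
        expect("ATTRIBUTE", "A")
        skip_optional("FOR")
        result["attributes"].append(take())
        while peek() == ",":
            take()
            result["attributes"].append(take())

    if peek() == "WITH":                   # optional security phrase
        take()
        expect("SECURITY")
        result["security"] = take()

    if pos != len(tokens):
        raise ValueError(f"unexpected trailing text: {tokens[pos]!r}")
    return result


if __name__ == "__main__":
    print(parse_add_entity("ADD-ENTITY E-T RECORD WHERE NAME Personnel"))
```

The shortest accepted form is the one shown in the usage
line: the abbreviation E-T, an entity-type name, WHERE
NAME, and a name clause, with OF, IS, and both bracketed
phrases omitted.
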